Preface



This curriculum was originally developed for a senior-level optics course in the 
Department of Physics and Astronomy at Brigham Young University. Topics 
are addressed from a physics perspective and include the propagation of light in 
matter, reflection and transmission at boundaries, polarization effects, dispersion, 
coherence, ray optics and imaging, diffraction, and the quantum nature of light. 
Students using this book should be familiar with differentiation, integration, and 
standard trigonometric and algebraic manipulation. A brief review of complex 
numbers, vector calculus, and Fourier transforms is provided in Chapter 0, but it 
is helpful if students already have some experience with these concepts. 

While the authors retain the copyright, we have made this book available free 
of charge at optics.byu.edu. This is our contribution toward a future world with 
free textbooks! The web site also provides a link to purchase bound copies of the 
book for the cost of printing. A collection of electronic material related to the 
text is available at the same site, including videos of students performing the lab 
assignments found in the book. 

The development of optics has a rich history. We have included historical 
sketches for a selection of the pioneers in the field to help students appreciate 
some of this historical context. These sketches are not intended to be authorita- 
tive, the information for most individuals having been gleaned primarily from 
Wikipedia. 

The authors may be contacted at opticsbook@byu.edu. We enjoy hearing 
reports from those using the book and welcome constructive feedback. We occa- 
sionally revise the text. The title page indicates the date of the last revision. 

We would like to thank all those who have helped improve this material. We 
especially thank John Colton, Bret Hess, and Harold Stokes for their careful review 
and extensive suggestions. This curriculum benefitted from a CCLI grant from 
the National Science Foundation Division of Undergraduate Education (DUE- 
9952773). 
Chapter 0

Mathematical Tools 



Our study of optics begins with Maxwell's equations in Chapter 1. Before we start,
look over this chapter to make sure you are comfortable with the mathematical
tools we'll be using. The vector calculus material in Section 0.1 will be used
beginning in Chapter 1, so you should review it now. In Section 0.2 we review
complex numbers. You have probably had some exposure to complex numbers,
but if you are like many students, you haven't yet fully appreciated their usefulness.
Be warned that your life will be much easier if you know the material
in Section 0.2 by heart. Complex notation is pervasive throughout the book,
beginning in Chapter 2.

You may safely procrastinate reviewing Sections 0.3 and 0.4 until they come 
up in the book. The linear algebra refresher in Section 0.3 is useful for Chapter 4, 
where we analyze multilayer coatings, and again in Chapter 6, where we discuss 
polarization. Section 0.4 provides an introduction to Fourier theory. Fourier trans- 
forms are used extensively in optics, and you should study Section 0.4 carefully 
before tackling Chapter 7. 

0.1 Vector Calculus 

Each position in space corresponds to a unique vector $\mathbf{r} = x\hat{\mathbf{x}} + y\hat{\mathbf{y}} + z\hat{\mathbf{z}}$, where
$\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$, and $\hat{\mathbf{z}}$ are unit vectors with length one, pointing along their respective axes.
Boldface type distinguishes a variable as a vector quantity, and the use of $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$,
and $\hat{\mathbf{z}}$ denotes a Cartesian coordinate system. Electric and magnetic fields are
vectors whose magnitude and direction can depend on position, as denoted by
$\mathbf{E}(\mathbf{r})$ or $\mathbf{B}(\mathbf{r})$. An example of such a field is $\mathbf{E}(\mathbf{r}) = q(\mathbf{r}-\mathbf{r}_0)/4\pi\epsilon_0|\mathbf{r}-\mathbf{r}_0|^3$, which
is the static electric field surrounding a point charge located at position $\mathbf{r}_0$. The
absolute-value brackets indicate the magnitude (or length) of the vector given by




René Descartes (1596-1650, French)
was born in La Haye en Touraine
(now Descartes), France. His mother 
died when he was an infant. His father 
was a member of parliament who en- 
couraged Descartes to become a lawyer. 
Descartes graduated with a degree in 
law from the University of Poitiers 
in 1616. In 1619, he had a series of 
dreams that led him to believe that he 
should instead pursue science. Descartes 
became one of the greatest mathemati- 
cians, physicists, and philosophers of 
all time. He is credited with inventing 
the Cartesian coordinate system, which
is named after him. For the first time, 
geometric shapes could be expressed as 
algebraic equations. (Wikipedia) 



$$|\mathbf{r}-\mathbf{r}_0| = \left|(x-x_0)\hat{\mathbf{x}} + (y-y_0)\hat{\mathbf{y}} + (z-z_0)\hat{\mathbf{z}}\right| = \sqrt{(x-x_0)^2 + (y-y_0)^2 + (z-z_0)^2} \tag{0.1}$$
Figure 0.1 The electric field vectors around a point charge.



Example 0.1 

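Compute the electric field at $\mathbf{r} = (2\hat{\mathbf{x}} + 2\hat{\mathbf{y}} + 2\hat{\mathbf{z}})$ Å due to a positive point charge $q$
positioned at $\mathbf{r}_0 = (1\hat{\mathbf{x}} + 1\hat{\mathbf{y}} + 2\hat{\mathbf{z}})$ Å.

Solution: As mentioned above, the field is given by $\mathbf{E}(\mathbf{r}) = q(\mathbf{r}-\mathbf{r}_0)/4\pi\epsilon_0|\mathbf{r}-\mathbf{r}_0|^3$.
We have

$$\mathbf{r}-\mathbf{r}_0 = \left((2-1)\hat{\mathbf{x}} + (2-1)\hat{\mathbf{y}} + (2-2)\hat{\mathbf{z}}\right)\text{Å} = (1\hat{\mathbf{x}} + 1\hat{\mathbf{y}})\,\text{Å}$$

and

$$|\mathbf{r}-\mathbf{r}_0| = \sqrt{(1)^2 + (1)^2}\,\text{Å} = \sqrt{2}\,\text{Å}$$

The electric field is then

$$\mathbf{E} = \frac{q\,(1\hat{\mathbf{x}} + 1\hat{\mathbf{y}})\,\text{Å}}{4\pi\epsilon_0\left(\sqrt{2}\,\text{Å}\right)^3}$$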

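In addition to position, the electric and magnetic fields almost always depend
on time in optics problems. For example, a common time-dependent field is
$\mathbf{E}(\mathbf{r},t) = \mathbf{E}_0\cos(\mathbf{k}\cdot\mathbf{r}-\omega t)$. The dot product $\mathbf{k}\cdot\mathbf{r}$ is an example of vector multiplication,
and signifies the following operation:

$$\begin{aligned}
\mathbf{k}\cdot\mathbf{r} &= (k_x\hat{\mathbf{x}} + k_y\hat{\mathbf{y}} + k_z\hat{\mathbf{z}})\cdot(x\hat{\mathbf{x}} + y\hat{\mathbf{y}} + z\hat{\mathbf{z}}) \\
&= k_x x + k_y y + k_z z \\
&= |\mathbf{k}||\mathbf{r}|\cos\phi
\end{aligned} \tag{0.2}$$

where $\phi$ is the angle between the vectors $\mathbf{k}$ and $\mathbf{r}$.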
Proof of the final line of (0.2)

Consider the plane that contains the two vectors $\mathbf{k}$ and $\mathbf{r}$. Call it the $x'y'$-plane. In
this coordinate system, the two vectors can be written as $\mathbf{k} = k\cos\theta\,\hat{\mathbf{x}}' + k\sin\theta\,\hat{\mathbf{y}}'$ and
$\mathbf{r} = r\cos\alpha\,\hat{\mathbf{x}}' + r\sin\alpha\,\hat{\mathbf{y}}'$, where $\theta$ and $\alpha$ are the respective angles that the two vectors
make with the $x'$-axis. The dot product gives $\mathbf{k}\cdot\mathbf{r} = kr(\cos\theta\cos\alpha + \sin\theta\sin\alpha)$.
This simplifies to $\mathbf{k}\cdot\mathbf{r} = kr\cos\phi$ (see (0.13)), where $\phi = \theta - \alpha$ is the angle between
the vectors. Thus, the dot product between two vectors is the product of the
magnitudes of each vector times the cosine of the angle between them.
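
The equivalence of the component form and the geometric form in (0.2) can be spot-checked numerically. This is a minimal Python sketch; the magnitudes and angles are arbitrary assumed values, and the helper `dot` is introduced here only for illustration.

```python
import math

def dot(a, b):
    """Component form of the dot product, as in the second line of (0.2)."""
    return sum(x * y for x, y in zip(a, b))

# Assumed illustrative magnitudes and angles, measured in the x'y'-plane.
k_mag, theta = 2.0, math.radians(70.0)
r_mag, alpha = 3.0, math.radians(25.0)

k = (k_mag * math.cos(theta), k_mag * math.sin(theta), 0.0)
r = (r_mag * math.cos(alpha), r_mag * math.sin(alpha), 0.0)

component_form = dot(k, r)
geometric_form = k_mag * r_mag * math.cos(theta - alpha)   # |k||r| cos(phi)
print(component_form, geometric_form)
```

The two printed numbers agree to within floating-point rounding.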



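Another type of vector multiplication is the cross product, which is accomplished
in the following manner:¹

$$\mathbf{E}\times\mathbf{B} =
\begin{vmatrix}
\hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\
E_x & E_y & E_z \\
B_x & B_y & B_z
\end{vmatrix}
= (E_yB_z - E_zB_y)\hat{\mathbf{x}} - (E_xB_z - E_zB_x)\hat{\mathbf{y}} + (E_xB_y - E_yB_x)\hat{\mathbf{z}} \tag{0.3}$$

¹The use of the determinant to generate the cross product is merely a fortuitous device for
remembering its form.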
Note that the cross product results in a vector, whereas the dot product mentioned
above results in a scalar (i.e. a number with appropriate units). The resultant 
vector is always perpendicular to the two vectors that are cross multiplied. If the 
fingers on your right hand curl from the first vector towards the second, your 
thumb will point in the direction of the result. The magnitude of the result equals 
the product of the magnitudes of the constituent vectors times the sine of the 
angle between them. 



Proof of cross-product properties 

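We label the plane containing $\mathbf{E}$ and $\mathbf{B}$ the $x'y'$-plane. In this coordinate system, the
two vectors can be written as $\mathbf{E} = E\cos\theta\,\hat{\mathbf{x}}' + E\sin\theta\,\hat{\mathbf{y}}'$ and $\mathbf{B} = B\cos\alpha\,\hat{\mathbf{x}}' + B\sin\alpha\,\hat{\mathbf{y}}'$,
where $\theta$ and $\alpha$ are the respective angles that the two vectors make with the $x'$-axis.
The cross product, according to (0.3), gives $\mathbf{E}\times\mathbf{B} = EB(\cos\theta\sin\alpha - \sin\theta\cos\alpha)\hat{\mathbf{z}}'$.
This simplifies to $\mathbf{E}\times\mathbf{B} = EB\sin\phi\,\hat{\mathbf{z}}'$ (see (0.14)), where $\phi = \alpha - \theta$ is the angle between
the vectors. The vectors $\mathbf{E}$ and $\mathbf{B}$, which both lie in the $x'y'$-plane, are both
perpendicular to $\hat{\mathbf{z}}'$. If $0 < \alpha - \theta < \pi$, the result $\mathbf{E}\times\mathbf{B}$ points in the positive $z'$
direction, which is consistent with the right-hand rule.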
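
The cross-product properties just proved can be illustrated with a short Python sketch. The vectors are assumed example values lying in a common plane, and the helpers `cross` and `dot` are written out here for illustration only.

```python
import math

def cross(e, b):
    """Cartesian components of E x B, following the expansion in (0.3)."""
    ex, ey, ez = e
    bx, by, bz = b
    return (ey * bz - ez * by, -(ex * bz - ez * bx), ex * by - ey * bx)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Assumed magnitudes and angles; B is rotated counterclockwise from E.
E_mag, theta = 2.0, math.radians(20.0)
B_mag, alpha = 1.5, math.radians(80.0)

E = (E_mag * math.cos(theta), E_mag * math.sin(theta), 0.0)
B = (B_mag * math.cos(alpha), B_mag * math.sin(alpha), 0.0)

ExB = cross(E, B)
print(ExB)                                        # only a z component survives
print(dot(ExB, E), dot(ExB, B))                   # perpendicular to both E and B
print(E_mag * B_mag * math.sin(alpha - theta))    # EB sin(phi) for comparison
```

Because $0 < \alpha - \theta < \pi$ for these values, the z component is positive, in line with the right-hand rule.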



We will use several multidimensional derivatives in our study of optics, namely
the gradient, the divergence, and the curl.² In Cartesian coordinates, the gradient
of a scalar function is given by

$$\nabla f(x,y,z) = \frac{\partial f}{\partial x}\hat{\mathbf{x}} + \frac{\partial f}{\partial y}\hat{\mathbf{y}} + \frac{\partial f}{\partial z}\hat{\mathbf{z}} \tag{0.4}$$



the divergence, which applies to vector functions, is given by

$$\nabla\cdot\mathbf{E} = \frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} \tag{0.5}$$



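and the curl, which also applies to vector functions, is given by

$$\nabla\times\mathbf{E} =
\begin{vmatrix}
\hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\
\partial/\partial x & \partial/\partial y & \partial/\partial z \\
E_x & E_y & E_z
\end{vmatrix}
= \left(\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z}\right)\hat{\mathbf{x}}
- \left(\frac{\partial E_z}{\partial x} - \frac{\partial E_x}{\partial z}\right)\hat{\mathbf{y}}
+ \left(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\right)\hat{\mathbf{z}} \tag{0.6}$$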



Example 0.2 

Derive the gradient (0.4) in cylindrical coordinates defined by the transformations
$x = \rho\cos\phi$ and $y = \rho\sin\phi$. (The coordinate $z$ remains unchanged.)

2 See M. R. Spiegel, Schaum's Outline of Advanced Mathematics for Engineers and Scientists, pp. 
126-127 (New York: McGraw-Hill 1971). 
Figure 0.2 The unit vectors $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ may be expressed in terms of
components along $\hat{\boldsymbol{\phi}}$ and $\hat{\boldsymbol{\rho}}$ in cylindrical coordinates.




Pierre-Simon Laplace (1749-1827, 
French) was born in Normandy, France 
to a farm laborer. Some wealthy neigh- 
bors noticed his unusual abilities and 
took an interest in his education. 
Laplace is sometimes revered as the 
"Newton" of France with contributions 
to mathematics and astronomy. The 
Laplacian differential operator as well as 
Laplace transforms are used widely in 
applied mathematics. (Wikipedia) 



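Solution: By inspection of Fig. 0.2, the Cartesian unit vectors may be expressed as

$$\hat{\mathbf{x}} = \cos\phi\,\hat{\boldsymbol{\rho}} - \sin\phi\,\hat{\boldsymbol{\phi}} \quad\text{and}\quad \hat{\mathbf{y}} = \sin\phi\,\hat{\boldsymbol{\rho}} + \cos\phi\,\hat{\boldsymbol{\phi}}$$

In accordance with the rules of calculus, the needed partial derivatives expressed
in terms of the new variables are

$$\frac{\partial}{\partial x} = \left(\frac{\partial\rho}{\partial x}\right)\frac{\partial}{\partial\rho} + \left(\frac{\partial\phi}{\partial x}\right)\frac{\partial}{\partial\phi}
\quad\text{and}\quad
\frac{\partial}{\partial y} = \left(\frac{\partial\rho}{\partial y}\right)\frac{\partial}{\partial\rho} + \left(\frac{\partial\phi}{\partial y}\right)\frac{\partial}{\partial\phi}$$

Meanwhile, the inverted form of the coordinate transformation is

$$\rho = \sqrt{x^2+y^2} \quad\text{and}\quad \phi = \tan^{-1}(y/x)$$

from which we obtain the following derivatives:

$$\frac{\partial\rho}{\partial x} = \frac{x}{\sqrt{x^2+y^2}} = \cos\phi, \qquad
\frac{\partial\phi}{\partial x} = -\frac{y}{x^2+y^2} = -\frac{\sin\phi}{\rho}$$

$$\frac{\partial\rho}{\partial y} = \frac{y}{\sqrt{x^2+y^2}} = \sin\phi, \qquad
\frac{\partial\phi}{\partial y} = \frac{x}{x^2+y^2} = \frac{\cos\phi}{\rho}$$

Putting this all together, we arrive at

$$\begin{aligned}
\nabla f &= \frac{\partial f}{\partial x}\hat{\mathbf{x}} + \frac{\partial f}{\partial y}\hat{\mathbf{y}} + \frac{\partial f}{\partial z}\hat{\mathbf{z}} \\
&= \left(\cos\phi\frac{\partial f}{\partial\rho} - \frac{\sin\phi}{\rho}\frac{\partial f}{\partial\phi}\right)\left(\cos\phi\,\hat{\boldsymbol{\rho}} - \sin\phi\,\hat{\boldsymbol{\phi}}\right)
+ \left(\sin\phi\frac{\partial f}{\partial\rho} + \frac{\cos\phi}{\rho}\frac{\partial f}{\partial\phi}\right)\left(\sin\phi\,\hat{\boldsymbol{\rho}} + \cos\phi\,\hat{\boldsymbol{\phi}}\right) + \frac{\partial f}{\partial z}\hat{\mathbf{z}} \\
&= \frac{\partial f}{\partial\rho}\hat{\boldsymbol{\rho}} + \frac{1}{\rho}\frac{\partial f}{\partial\phi}\hat{\boldsymbol{\phi}} + \frac{\partial f}{\partial z}\hat{\mathbf{z}}
\end{aligned}$$

where we have used $\cos^2\phi + \sin^2\phi = 1$ (see Ex. 0.4).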
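
The cylindrical form of the gradient derived in Example 0.2 can be checked numerically against the Cartesian form (0.4). The sketch below is only an illustration: the test function $f = x^2y + z$, the step size `h`, and all function names are assumptions chosen for this check, and the derivatives are estimated with central finite differences.

```python
import math

def f_cart(x, y, z):
    """Assumed test function f = x^2 y + z, with Cartesian gradient (2xy, x^2, 1)."""
    return x * x * y + z

def f_cyl(rho, phi, z):
    """The same function expressed through x = rho cos(phi), y = rho sin(phi)."""
    return f_cart(rho * math.cos(phi), rho * math.sin(phi), z)

def grad_cylindrical(rho, phi, z, h=1e-6):
    """Components (df/drho, (1/rho) df/dphi, df/dz) by central differences."""
    d_rho = (f_cyl(rho + h, phi, z) - f_cyl(rho - h, phi, z)) / (2 * h)
    d_phi = (f_cyl(rho, phi + h, z) - f_cyl(rho, phi - h, z)) / (2 * h)
    d_z = (f_cyl(rho, phi, z + h) - f_cyl(rho, phi, z - h)) / (2 * h)
    return d_rho, d_phi / rho, d_z

def to_cartesian(g_rho, g_phi, g_z, phi):
    """Project onto x and y using rho-hat = (cos, sin) and phi-hat = (-sin, cos)."""
    gx = g_rho * math.cos(phi) - g_phi * math.sin(phi)
    gy = g_rho * math.sin(phi) + g_phi * math.cos(phi)
    return gx, gy, g_z

rho, phi, z = 2.0, 0.7, 0.3          # arbitrary test point
gx, gy, gz = to_cartesian(*grad_cylindrical(rho, phi, z), phi)
x, y = rho * math.cos(phi), rho * math.sin(phi)
exact = (2 * x * y, x * x, 1.0)      # analytic Cartesian gradient from (0.4)
print((gx, gy, gz), exact)
```

The two tuples agree to within the finite-difference error.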

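We will sometimes need a multidimensional second derivative called the
Laplacian. When applied to a scalar function, it is defined as the divergence of a
gradient:

$$\nabla^2 f(x,y,z) = \nabla\cdot\left[\nabla f(x,y,z)\right] \tag{0.7}$$

In Cartesian coordinates, this reduces to

$$\nabla^2 f(x,y,z) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} \tag{0.8}$$

Since the Laplacian applied to a scalar gives a result that is also a scalar, in Cartesian
coordinates we deal with vector functions by applying the Laplacian to the
scalar function attached to each unit vector:

$$\nabla^2\mathbf{E} =
\left(\frac{\partial^2 E_x}{\partial x^2} + \frac{\partial^2 E_x}{\partial y^2} + \frac{\partial^2 E_x}{\partial z^2}\right)\hat{\mathbf{x}}
+ \left(\frac{\partial^2 E_y}{\partial x^2} + \frac{\partial^2 E_y}{\partial y^2} + \frac{\partial^2 E_y}{\partial z^2}\right)\hat{\mathbf{y}}
+ \left(\frac{\partial^2 E_z}{\partial x^2} + \frac{\partial^2 E_z}{\partial y^2} + \frac{\partial^2 E_z}{\partial z^2}\right)\hat{\mathbf{z}} \tag{0.9}$$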
This is possible because each unit vector is a constant in Cartesian coordinates.

The various multidimensional derivatives take on more complicated forms
in non-Cartesian coordinates such as cylindrical or spherical. You can derive the
Laplacian for these other coordinate systems by changing variables and rewriting
the unit vectors starting from the above Cartesian expression. (See Problem 0.10.)
Regardless of the coordinate system, the Laplacian for a vector function can be
obtained from first derivatives through

$$\nabla^2\mathbf{E} = \nabla(\nabla\cdot\mathbf{E}) - \nabla\times(\nabla\times\mathbf{E}) \tag{0.10}$$



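Verification of (0.10) in Cartesian coordinates

From (0.6), we have

$$\nabla\times\mathbf{E} =
\left(\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z}\right)\hat{\mathbf{x}}
- \left(\frac{\partial E_z}{\partial x} - \frac{\partial E_x}{\partial z}\right)\hat{\mathbf{y}}
+ \left(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\right)\hat{\mathbf{z}}$$

and

$$\begin{aligned}
\nabla\times(\nabla\times\mathbf{E}) &=
\begin{vmatrix}
\hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\
\partial/\partial x & \partial/\partial y & \partial/\partial z \\
\left(\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z}\right) & -\left(\frac{\partial E_z}{\partial x} - \frac{\partial E_x}{\partial z}\right) & \left(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\right)
\end{vmatrix} \\
&= \left[\frac{\partial}{\partial y}\left(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\right) + \frac{\partial}{\partial z}\left(\frac{\partial E_z}{\partial x} - \frac{\partial E_x}{\partial z}\right)\right]\hat{\mathbf{x}} \\
&\quad - \left[\frac{\partial}{\partial x}\left(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\right) - \frac{\partial}{\partial z}\left(\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z}\right)\right]\hat{\mathbf{y}} \\
&\quad - \left[\frac{\partial}{\partial x}\left(\frac{\partial E_z}{\partial x} - \frac{\partial E_x}{\partial z}\right) + \frac{\partial}{\partial y}\left(\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z}\right)\right]\hat{\mathbf{z}}
\end{aligned}$$

After adding and subtracting $\frac{\partial^2 E_x}{\partial x^2}\hat{\mathbf{x}} + \frac{\partial^2 E_y}{\partial y^2}\hat{\mathbf{y}} + \frac{\partial^2 E_z}{\partial z^2}\hat{\mathbf{z}}$ and then rearranging, we
get

$$\begin{aligned}
\nabla\times(\nabla\times\mathbf{E}) &=
\left[\frac{\partial^2 E_x}{\partial x^2} + \frac{\partial^2 E_y}{\partial x\partial y} + \frac{\partial^2 E_z}{\partial x\partial z}\right]\hat{\mathbf{x}}
+ \left[\frac{\partial^2 E_x}{\partial x\partial y} + \frac{\partial^2 E_y}{\partial y^2} + \frac{\partial^2 E_z}{\partial y\partial z}\right]\hat{\mathbf{y}}
+ \left[\frac{\partial^2 E_x}{\partial x\partial z} + \frac{\partial^2 E_y}{\partial y\partial z} + \frac{\partial^2 E_z}{\partial z^2}\right]\hat{\mathbf{z}} \\
&\quad - \left[\frac{\partial^2 E_x}{\partial x^2} + \frac{\partial^2 E_x}{\partial y^2} + \frac{\partial^2 E_x}{\partial z^2}\right]\hat{\mathbf{x}}
- \left[\frac{\partial^2 E_y}{\partial x^2} + \frac{\partial^2 E_y}{\partial y^2} + \frac{\partial^2 E_y}{\partial z^2}\right]\hat{\mathbf{y}}
- \left[\frac{\partial^2 E_z}{\partial x^2} + \frac{\partial^2 E_z}{\partial y^2} + \frac{\partial^2 E_z}{\partial z^2}\right]\hat{\mathbf{z}}
\end{aligned}$$

After some factorization, we obtain

$$\begin{aligned}
\nabla\times(\nabla\times\mathbf{E}) &=
\left[\hat{\mathbf{x}}\frac{\partial}{\partial x} + \hat{\mathbf{y}}\frac{\partial}{\partial y} + \hat{\mathbf{z}}\frac{\partial}{\partial z}\right]\left[\frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z}\right]
- \left[\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right]\left[E_x\hat{\mathbf{x}} + E_y\hat{\mathbf{y}} + E_z\hat{\mathbf{z}}\right] \\
&= \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E}
\end{aligned}$$

where on the final line we invoked (0.4), (0.5), and (0.8).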



We will also encounter several integral theorems³ involving vector functions.
The divergence theorem for a vector function $\mathbf{F}$ is

$$\oint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,da = \int_V \nabla\cdot\mathbf{F}\,dv \tag{0.11}$$



3 For succinct treatments of the divergence theorem and Stokes' theorem, see M. R. Spiegel, 
Schaum's Outline of Advanced Mathematics for Engineers and Scientists, p. 154 (New York: McGraw- 
Hill 1971). 
Figure 0.3 The function F (red
arrows) plotted for several points
on the surface S.



The integration on the left-hand side is over the closed surface S, which contains 
the volume V associated with the integration on the right-hand side. The unit 
vector n points outward, normal to the surface. The divergence theorem is espe- 
cially useful in connection with Gauss' law, where the left-hand side is interpreted 
as the number of field lines exiting a closed surface. 

Example 0.3 

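Check the divergence theorem (0.11) for the vector function $\mathbf{F}(x,y,z) = y^2\hat{\mathbf{x}} + xy\,\hat{\mathbf{y}} + x^2z\,\hat{\mathbf{z}}$.
Take as the volume a cube contained by the six planes $|x| = 1$, $|y| = 1$, and
$|z| = 1$.

Solution: First, we evaluate the left side of (0.11) for the function:

$$\begin{aligned}
\oint_S \mathbf{F}\cdot\hat{\mathbf{n}}\,da &=
\int_{-1}^{1}\!\int_{-1}^{1} dx\,dy\,(x^2z)_{z=1} - \int_{-1}^{1}\!\int_{-1}^{1} dx\,dy\,(x^2z)_{z=-1}
+ \int_{-1}^{1}\!\int_{-1}^{1} dx\,dz\,(xy)_{y=1} \\
&\quad - \int_{-1}^{1}\!\int_{-1}^{1} dx\,dz\,(xy)_{y=-1}
+ \int_{-1}^{1}\!\int_{-1}^{1} dy\,dz\,(y^2)_{x=1} - \int_{-1}^{1}\!\int_{-1}^{1} dy\,dz\,(y^2)_{x=-1} \\
&= 2\int_{-1}^{1}\!\int_{-1}^{1} dx\,dy\,x^2 + 2\int_{-1}^{1}\!\int_{-1}^{1} dx\,dz\,x
= 4\left.\frac{x^3}{3}\right|_{-1}^{1} + 4\left.\frac{x^2}{2}\right|_{-1}^{1} = \frac{8}{3}
\end{aligned}$$

Now we evaluate the right side of (0.11):

$$\int_V \nabla\cdot\mathbf{F}\,dv = \int_{-1}^{1}\!\int_{-1}^{1}\!\int_{-1}^{1} dx\,dy\,dz\,\left[x + x^2\right]
= 4\int_{-1}^{1} dx\,\left[x + x^2\right] = 4\left[\frac{x^2}{2} + \frac{x^3}{3}\right]_{-1}^{1} = \frac{8}{3}$$

Another important theorem is Stokes' theorem:

$$\int_S (\nabla\times\mathbf{F})\cdot\hat{\mathbf{n}}\,da = \oint_C \mathbf{F}\cdot d\boldsymbol{\ell} \tag{0.12}$$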



The integration on the left-hand side is over an open surface $S$ (not enclosing a
volume). The integration on the right-hand side is around the edge of the surface.
Again, $\hat{\mathbf{n}}$ is a unit vector that always points normal to the surface. The vector $d\boldsymbol{\ell}$
points along the curve $C$ that bounds the surface $S$. If the fingers of your right
hand point in the direction of integration around $C$, then your thumb points
in the direction of $\hat{\mathbf{n}}$. Stokes' theorem is especially useful in connection with
Ampere's law and Faraday's law. The right-hand side is an integration of a field
around a loop.
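
Returning to the divergence theorem, the two sides of (0.11) for the field and cube of Example 0.3 can also be estimated numerically. The Python sketch below uses a simple midpoint rule; the grid size `n` and all function names are assumptions introduced for illustration, not part of the original example.

```python
def F(x, y, z):
    """The vector field of Example 0.3: F = y^2 x-hat + x y y-hat + x^2 z z-hat."""
    return (y * y, x * y, x * x * z)

def div_F(x, y, z):
    """Divergence of F computed by hand from (0.5): 0 + x + x^2."""
    return x + x * x

def midpoints(n):
    """Midpoints of n equal cells on [-1, 1], together with the cell width."""
    h = 2.0 / n
    return [-1.0 + (i + 0.5) * h for i in range(n)], h

def surface_flux(n=40):
    """Outward flux of F through the six faces of the cube (midpoint rule)."""
    pts, h = midpoints(n)
    total = 0.0
    for u in pts:
        for v in pts:
            total += F(1, u, v)[0] - F(-1, u, v)[0]   # faces x = +1 and x = -1
            total += F(u, 1, v)[1] - F(u, -1, v)[1]   # faces y = +1 and y = -1
            total += F(u, v, 1)[2] - F(u, v, -1)[2]   # faces z = +1 and z = -1
    return total * h * h

def volume_integral(n=40):
    """Integral of div F over the cube (midpoint rule)."""
    pts, h = midpoints(n)
    return sum(div_F(x, y, z) for x in pts for y in pts for z in pts) * h**3

flux = surface_flux()
volume = volume_integral()
print(flux, volume, 8 / 3)
```

Both estimates approach 8/3 as `n` grows, in agreement with the analytic result.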



0.2 Complex Numbers 

It is often convenient to represent electromagnetic wave phenomena (i.e. light) as
a superposition of sinusoidal functions, each having the form $A\cos(\alpha + \beta)$. The
sine function is intrinsically present in this formula via the identity

$$\cos(\alpha + \beta) = \cos\alpha\cos\beta - \sin\alpha\sin\beta \tag{0.13}$$

This is a good formula to commit to memory, as well as the frequently used
identity

$$\sin(\alpha + \beta) = \sin\alpha\cos\beta + \sin\beta\cos\alpha \tag{0.14}$$

With a basic familiarity with trigonometry we can approach many optical 
problems including those involving the addition of multiple waves. However, the 
manipulation of trigonometric functions via identities such as (0.13) and (0.14) 
can be cumbersome and tedious. Fortunately complex-number notation offers 
an equivalent approach with far less busy work. The modest investment needed to 
become comfortable with complex notation is definitely worth it; optics problems 
can become cumbersome enough even with the most efficient methods! 

The convenience of complex-number notation has its origins in Euler's formula:

$$e^{i\phi} = \cos\phi + i\sin\phi \tag{0.15}$$

where $i = \sqrt{-1}$ is an imaginary number. By inverting Euler's formula (0.15) we
can obtain the following representation of the cosine and sine functions:

$$\cos\phi = \frac{e^{i\phi} + e^{-i\phi}}{2}, \qquad \sin\phi = \frac{e^{i\phi} - e^{-i\phi}}{2i} \tag{0.16}$$

Equation (0.16) shows how ordinary sines and cosines are intimately related to
hyperbolic cosines and hyperbolic sines. If $\phi$ happens to be imaginary such that
$\phi = i\gamma$ where $\gamma$ is real, then we have

$$\sin i\gamma = \frac{e^{-\gamma} - e^{\gamma}}{2i} = i\sinh\gamma, \qquad
\cos i\gamma = \frac{e^{-\gamma} + e^{\gamma}}{2} = \cosh\gamma \tag{0.17}$$



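Proof of Euler's formula

We can prove Euler's formula using a Taylor's series expansion:

$$f(x) = f(x_0) + \frac{1}{1!}(x - x_0)\left.\frac{df}{dx}\right|_{x=x_0} + \frac{1}{2!}(x - x_0)^2\left.\frac{d^2f}{dx^2}\right|_{x=x_0} + \cdots \tag{0.18}$$

By expanding each function appearing in (0.15) in a Taylor's series about the origin
we obtain

$$\begin{aligned}
\cos\phi &= 1 - \frac{\phi^2}{2!} + \frac{\phi^4}{4!} - \cdots \\
i\sin\phi &= i\phi - i\frac{\phi^3}{3!} + i\frac{\phi^5}{5!} - \cdots \\
e^{i\phi} &= 1 + i\phi - \frac{\phi^2}{2!} - i\frac{\phi^3}{3!} + \frac{\phi^4}{4!} + i\frac{\phi^5}{5!} - \cdots
\end{aligned} \tag{0.19}$$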




Leonhard Euler (1707-1783, Swiss) 
was born in Basel, Switzerland. His 
father, Paul Euler, was friends with 
the well-known mathematician Johann 
Bernoulli, who discovered young Euler's 
great talent for mathematics and tu- 
tored him regularly. Euler enrolled at 
the University of Basel at age thirteen. 
In 1726 Euler accepted an offer to join 
the Russian Academy of Sciences in 
St Petersburg, having unsuccessfully 
applied for a professorship at the Uni- 
versity of Basel. Under the auspices of 
the Czars (with the exception of 12-year- 
old Peter II), foreign academicians in 
the Russian Academy were given con- 
siderable freedom to pursue scientific 
questions with relatively light teaching 
duties. Euler spent his early career in 
Russia, his mid career in Berlin, and 
his later career again in Russia. Euler 
introduced the concept of a function. 
He successfully defined logarithms and 
exponential functions for complex num- 
bers and discovered the connection to 
trigonometric functions. The special 
case of Euler's formula $e^{i\pi} + 1 = 0$ has
been voted by modern fans of math- 
ematics (including Richard Feynman) 
as "the Most Beautiful Mathematical 
Formula Ever" for its single uses of 
addition, multiplication, exponentiation,
equality, and the constants 0, 1,
e, i, and π. Euler and his wife, Katharina
Gsell, were the parents of 13 children,
many of whom died in childhood.
(Wikipedia) 
Brook Taylor (1685-1731, English) was
born in Middlesex, England. He studied 
at Cambridge as a fellow-commoner 
earning a bachelor degree in 1709 
and a doctoral degree in 1714. Soon 
thereafter, he developed the branch 
of mathematics known as calculus of 
finite differences. He used it to study 
the movement of vibrating strings. As 
part of that work, he developed the for- 
mula known today as Taylor's theorem, 
which was under-appreciated until 1772, 
when French mathematician Lagrange 
referred to it as "the main foundation of 
differential calculus." (Wikipedia) 



The last line of (0.19) is seen to be the sum of the first two lines, from which Euler's 
formula directly follows. 
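
The series argument can be illustrated numerically by summing the Taylor series of $e^{i\phi}$ term by term and comparing with $\cos\phi + i\sin\phi$. This Python sketch is an illustration only; the function name `exp_i_phi_series`, the number of terms, and the test angle are assumptions.

```python
import cmath
import math

def exp_i_phi_series(phi, terms=30):
    """Partial sum of sum_n (i*phi)^n / n!, the Taylor series of exp(i*phi) about 0."""
    total = 0j
    term = 1 + 0j                      # the n = 0 term
    for n in range(terms):
        total += term
        term *= 1j * phi / (n + 1)     # build term n+1 from term n
    return total

phi = 1.2                              # arbitrary test angle in radians
series = exp_i_phi_series(phi)
euler = complex(math.cos(phi), math.sin(phi))
print(series, euler, cmath.exp(1j * phi))
```

The even-order terms of the sum reproduce the cosine series and the odd-order terms reproduce $i$ times the sine series, so the three printed values agree to rounding.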



Example 0.4 

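Prove (0.13) and (0.14) as well as $\cos^2\phi + \sin^2\phi = 1$ by taking advantage of (0.16).

Solution: We start with (0.13). By direct application of (0.16) and some rearranging
we have

$$\begin{aligned}
\cos\alpha\cos\beta - \sin\alpha\sin\beta
&= \frac{e^{i\alpha} + e^{-i\alpha}}{2}\,\frac{e^{i\beta} + e^{-i\beta}}{2} - \frac{e^{i\alpha} - e^{-i\alpha}}{2i}\,\frac{e^{i\beta} - e^{-i\beta}}{2i} \\
&= \frac{e^{i(\alpha+\beta)} + e^{i(\alpha-\beta)} + e^{-i(\alpha-\beta)} + e^{-i(\alpha+\beta)}}{4}
+ \frac{e^{i(\alpha+\beta)} - e^{i(\alpha-\beta)} - e^{-i(\alpha-\beta)} + e^{-i(\alpha+\beta)}}{4} \\
&= \frac{e^{i(\alpha+\beta)} + e^{-i(\alpha+\beta)}}{2} = \cos(\alpha + \beta)
\end{aligned}$$

We can prove (0.14) using the same technique:

$$\begin{aligned}
\sin\alpha\cos\beta + \sin\beta\cos\alpha
&= \frac{e^{i\alpha} - e^{-i\alpha}}{2i}\,\frac{e^{i\beta} + e^{-i\beta}}{2} + \frac{e^{i\beta} - e^{-i\beta}}{2i}\,\frac{e^{i\alpha} + e^{-i\alpha}}{2} \\
&= \frac{e^{i(\alpha+\beta)} + e^{i(\alpha-\beta)} - e^{-i(\alpha-\beta)} - e^{-i(\alpha+\beta)}}{4i}
+ \frac{e^{i(\alpha+\beta)} - e^{i(\alpha-\beta)} + e^{-i(\alpha-\beta)} - e^{-i(\alpha+\beta)}}{4i} \\
&= \frac{e^{i(\alpha+\beta)} - e^{-i(\alpha+\beta)}}{2i} = \sin(\alpha + \beta)
\end{aligned}$$

Finally, for $\cos^2\phi + \sin^2\phi = 1$ we have

$$\cos^2\phi + \sin^2\phi = \left(\frac{e^{i\phi} + e^{-i\phi}}{2}\right)^2 + \left(\frac{e^{i\phi} - e^{-i\phi}}{2i}\right)^2
= \frac{e^{2i\phi} + 2 + e^{-2i\phi}}{4} - \frac{e^{2i\phi} - 2 + e^{-2i\phi}}{4} = 1$$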



As was mentioned previously, we will often be interested in waves of the
form $A\cos(\alpha + \beta)$. We can use complex notation to represent this wave simply by
writing

$$A\cos(\alpha + \beta) = \mathrm{Re}\left\{\tilde{A}e^{i\alpha}\right\} \tag{0.20}$$

where the phase $\beta$ is conveniently contained within the complex factor $\tilde{A} = Ae^{i\beta}$.
The operation $\mathrm{Re}\{\,\}$ means to retain only the real part of the argument without
regard for the imaginary part. As an example, we have $\mathrm{Re}\{1 + 2i\} = 1$. The formula
(0.20) follows directly from Euler's equation (0.15).
It is common (even conventional) to omit the explicit writing of $\mathrm{Re}\{\,\}$. Thus,
physicists participate in a conspiracy that $\tilde{A}e^{i\alpha}$ actually means $A\cos(\alpha + \beta)$. This
laziness is permissible because it is possible to perform linear operations on
$\mathrm{Re}\{f\}$ such as addition, differentiation, or integration while procrastinating the
taking of the real part until the end:

$$\begin{aligned}
\mathrm{Re}\{f\} + \mathrm{Re}\{g\} &= \mathrm{Re}\{f + g\} \\
\frac{d}{dx}\mathrm{Re}\{f\} &= \mathrm{Re}\left\{\frac{df}{dx}\right\} \\
\int \mathrm{Re}\{f\}\,dx &= \mathrm{Re}\left\{\int f\,dx\right\}
\end{aligned} \tag{0.21}$$

As an example, note that $\mathrm{Re}\{1 + 2i\} + \mathrm{Re}\{3 + 4i\} = \mathrm{Re}\{(1 + 2i) + (3 + 4i)\} = 4$.
However, we must be careful when performing other operations such as multiplication.
In this case, it is essential to take the real parts before performing the
operation. Notice that

$$\mathrm{Re}\{f\}\times\mathrm{Re}\{g\} \neq \mathrm{Re}\{f\times g\} \tag{0.22}$$

As an example, we see $\mathrm{Re}\{1 + 2i\}\times\mathrm{Re}\{3 + 4i\} = 3$, but $\mathrm{Re}\{(1 + 2i)(3 + 4i)\} = -5$.
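
The examples above are easy to reproduce with Python's built-in complex type. The sketch below is only an illustration; the amplitude and phases used for the wave are assumed values.

```python
import cmath
import math

f = 1 + 2j
g = 3 + 4j

# Linear operation: the real part may be taken before or after adding.
sum_of_real_parts = f.real + g.real
real_part_of_sum = (f + g).real

# Nonlinear operation: for multiplication the two orders disagree.
product_of_real_parts = f.real * g.real
real_part_of_product = (f * g).real

# The wave A cos(alpha + beta) written as Re{A~ exp(i alpha)}, with A~ = A exp(i beta).
A, alpha, beta = 2.0, 1.1, 0.3          # assumed illustrative values
A_tilde = A * cmath.exp(1j * beta)
wave_complex = (A_tilde * cmath.exp(1j * alpha)).real
wave_direct = A * math.cos(alpha + beta)

print(sum_of_real_parts, real_part_of_sum)
print(product_of_real_parts, real_part_of_product)
print(wave_complex, wave_direct)
```

The first pair of numbers agrees, the second pair does not, and the third pair confirms (0.20) for the chosen values.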

When dealing with complex numbers it is often advantageous to transform
between a Cartesian representation and a polar representation. With the aid of
Euler's formula, it is possible to transform any complex number $a + ib$ into the
form $\rho e^{i\phi}$, where $a$, $b$, $\rho$, and $\phi$ are real. From (0.15), the required connection
between $(\rho, \phi)$ and $(a, b)$ is

$$\rho e^{i\phi} = \rho\cos\phi + i\rho\sin\phi = a + ib \tag{0.23}$$

The real and imaginary parts of this equation must separately be equal. Thus, we
have

$$a = \rho\cos\phi, \qquad b = \rho\sin\phi \tag{0.24}$$

These equations can be inverted to yield

$$\rho = \sqrt{a^2 + b^2}, \qquad \phi = \tan^{-1}\frac{b}{a} \quad (a > 0) \tag{0.25}$$

When $a < 0$, we must adjust $\phi$ by $\pi$ since the arctangent has a range only from
$-\pi/2$ to $\pi/2$.

The transformations in (0.24) and (0.25) have a clear geometrical interpretation
in the complex plane, and this makes it easier to remember them. They are
just the usual connections between Cartesian and polar coordinates. As seen in
Fig. 0.4, $\rho$ is the hypotenuse of a right triangle having legs with lengths $a$ and $b$,
and $\phi$ is the angle that the hypotenuse makes with the x-axis. Again, you should
be careful when $a$ is negative since the arctangent is defined in quadrants I and


Gerolamo Cardano (1501-1576, Italian) 
was the first to introduce the notion 
of complex numbers (which he called 
"fictitious") while developing solutions 
to cubic and quartic equations. He was 
born in Pavia, Italy, the illegitimate son 
of a lawyer who was an acquaintance 
of Leonardo da Vinci. Cardano was for- 
tunate to survive infancy as his father 
claimed that his mother attempted to 
abort him and his older siblings all died 
of the plague. Cardano studied at the 
University of Pavia and later at Padua. 
He was known for being eccentric and 
confrontational, which did not earn him 
many friends. He supported himself 
in part as a somewhat successful gam- 
bler, but he was often short of money. 
Cardano also introduced binomial co- 
efficients and the binomial theorem. 
(Wikipedia) 



Figure 0.4 A number in the complex plane can be represented
either by Cartesian or polar representation.
Figure 0.5 Geometric representation of $-3 + 4i$



IV. An easy way to deal with the situation of a negative $a$ is to factor the minus
sign out before proceeding (i.e. $a + ib = -(-a - ib)$). Then the transformation
is made on $-a - ib$ where $-a$ is positive. The overall minus sign out in front is
just carried along unaffected and can be factored back in at the end. Notice that
$-\rho e^{i\phi}$ is the same as $\rho e^{i(\phi \pm \pi)}$.

Example 0.5 

Write $-3 + 4i$ in polar format.

Solution: We must be careful with the negative real part since it indicates a quadrant
(in this case II) outside of the range of the inverse tangent (quadrants I and
IV). It is best to factor the negative out and deal with it separately.

$$-3 + 4i = -(3 - 4i) = -\sqrt{3^2 + (-4)^2}\;e^{i\tan^{-1}\frac{(-4)}{3}} = e^{i\pi}\,5\,e^{-i\tan^{-1}\frac{4}{3}} = 5\,e^{i\left(\pi - \tan^{-1}\frac{4}{3}\right)}$$

Finally, we consider the concept of a complex conjugate. The conjugate of a
complex number $z = a + ib$ is denoted with an asterisk and amounts to changing
the sign on the imaginary part of the number:

$$z^* = (a + ib)^* = a - ib \tag{0.26}$$

The complex conjugate is useful when computing the absolute value of a complex
number:

$$|z| = \sqrt{z^*z} = \sqrt{(a - ib)(a + ib)} = \sqrt{a^2 + b^2} = \rho \tag{0.27}$$

Note that the absolute value of a complex number is the same as its magnitude $\rho$
as defined in (0.25). The complex conjugate is also useful for eliminating complex
numbers from the denominator of expressions:

$$\frac{a + ib}{c + id} = \frac{(a + ib)(c - id)}{(c + id)(c - id)} = \frac{ac + bd + i(bc - ad)}{c^2 + d^2} \tag{0.28}$$

No matter how complicated an expression, the complex conjugate is calculated
by inserting a minus sign in front of all occurrences of $i$ in the expression,
and placing an asterisk on all complex variables in the expression. For example,
the complex conjugate of $\rho e^{i\phi}$ is $\rho e^{-i\phi}$ assuming $\rho$ and $\phi$ are real, as can be seen
from Euler's formula (0.15). As another example consider

$$\left[E_0\exp\{i(Kz - \omega t)\}\right]^* = E_0^*\exp\{-i(K^*z - \omega t)\} \tag{0.29}$$

assuming $z$, $\omega$, and $t$ are real, but $E_0$ and $K$ are complex.

A common way of obtaining the real part of an expression is by adding the
complex conjugate and dividing the result by 2:

$$\mathrm{Re}\{z\} = \frac{1}{2}\left(z + z^*\right) \tag{0.30}$$
Notice that the expression for $\cos\phi$ in (0.16) is an example of this formula. Sometimes
when a lengthy expression is added to its own complex conjugate, we let
"C.C." represent the complex conjugate in order to avoid writing the expression
twice.

In optics we sometimes encounter a complex angle, such as $Kz$ in (0.29). The
imaginary part of $K$ governs exponential decay (or growth) when a light wave
propagates in an absorptive (or amplifying) medium. Similarly, when we compute
the transmission angle for light incident upon a surface beyond the critical angle
for total internal reflection, we encounter the arcsine of a number greater than
one in an effort to satisfy Snell's law. Even though such an angle does not exist in
the physical sense, a complex value for the angle can be found, which satisfies
(0.16) and describes evanescent waves.
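
A complex angle of this kind can be produced with Python's standard `cmath.asin`. The sketch below is only an illustration: the refractive indices and incident angle are assumed values for a glass-to-air interface beyond the critical angle, and Snell's law is taken in its usual form $n_i\sin\theta_i = n_t\sin\theta_t$.

```python
import cmath
import math

def sin_via_exponentials(angle):
    """sin(angle) evaluated from (0.16); works for complex angles as well."""
    return (cmath.exp(1j * angle) - cmath.exp(-1j * angle)) / 2j

# Assumed illustrative values: light inside glass striking a glass-air boundary.
n_incident, n_transmitted = 1.5, 1.0
theta_incident = math.radians(60.0)

# Snell's law asks for a sine larger than one, which no real angle can supply.
sin_theta_t = n_incident * math.sin(theta_incident) / n_transmitted
theta_t = cmath.asin(sin_theta_t)          # complex transmission angle

print(sin_theta_t)
print(theta_t)
print(sin_via_exponentials(theta_t))       # recovers sin_theta_t
```

The real part of the resulting angle is $\pi/2$ and its imaginary part is nonzero; inserting the angle into (0.16) returns the required sine.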

0.3 Linear Algebra 

Throughout this book we will often encounter sets of linear equations. (They 
are called linear equations because they represent lines in a plane or in space.) 
Most often, there are just two equations with two variables to solve. The simplest 
example of such a set of equations is 

Ax + By = F and Cx + Dy = G (0.31) 

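where $x$ and $y$ are variables. A set of linear equations such as (0.31) can be
expressed using matrix notation as

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} Ax + By \\ Cx + Dy \end{bmatrix}
= \begin{bmatrix} F \\ G \end{bmatrix} \tag{0.32}$$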



As seen above, the 2 × 2 matrix multiplied onto the two-dimensional column
vector results in a two-dimensional vector. The elements of rows are multiplied 
onto elements of the column and summed to create each new element in the 
result. A matrix can also be multiplied onto another matrix (rows multiplying 
columns, resulting in a matrix). The order of multiplication is important; matrix 
multiplication is not commutative. 

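To solve a matrix equation such as (0.32), we multiply both sides by an inverse
matrix, which gives

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}\begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}\begin{bmatrix} F \\ G \end{bmatrix} \tag{0.33}$$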



The inverse matrix has the property that

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}\begin{bmatrix} A & B \\ C & D \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \tag{0.34}$$

where the right-hand side is called the identity matrix. You can easily check that
the identity matrix leaves unchanged anything that it multiplies, and so (0.33)
simplifies to

$$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}\begin{bmatrix} F \\ G \end{bmatrix}$$

Once the inverse matrix is found, the matrix multiplication on the right can be
performed and the answers for $x$ and $y$ obtained as the upper and lower elements
of the result.

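The inverse of a 2 × 2 matrix is given by

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}
= \frac{1}{\begin{vmatrix} A & B \\ C & D \end{vmatrix}}\begin{bmatrix} D & -B \\ -C & A \end{bmatrix} \tag{0.35}$$

where

$$\begin{vmatrix} A & B \\ C & D \end{vmatrix} = AD - CB$$

is called the determinant. We can check that (0.35) is correct by direct substitution:

$$\begin{aligned}
\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1}\begin{bmatrix} A & B \\ C & D \end{bmatrix}
&= \frac{1}{AD - BC}\begin{bmatrix} D & -B \\ -C & A \end{bmatrix}\begin{bmatrix} A & B \\ C & D \end{bmatrix} \\
&= \frac{1}{AD - BC}\begin{bmatrix} AD - BC & 0 \\ 0 & AD - BC \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\end{aligned} \tag{0.36}$$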
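
The procedure of (0.33) through (0.35) can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names are hypothetical, and the example system $2x + y = 5$, $x + 3y = 10$ was chosen only because its solution is easy to verify by hand.

```python
def inverse_2x2(m):
    """Inverse of ((A, B), (C, D)) using (0.35); requires a nonzero determinant."""
    (a, b), (c, d) = m
    det = a * d - c * b
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return ((d / det, -b / det), (-c / det, a / det))

def matvec(m, v):
    """Multiply a 2 x 2 matrix onto a two-element column vector (rows times column)."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

# Assumed example system: 2x + y = 5 and x + 3y = 10.
M = ((2.0, 1.0), (1.0, 3.0))
rhs = (5.0, 10.0)

x, y = matvec(inverse_2x2(M), rhs)
print(x, y)
print(matvec(M, (x, y)))     # substituting back reproduces the right-hand side
```

For larger systems a dedicated linear algebra library would normally be preferred over an explicit inverse; the explicit form is used here only because it mirrors (0.35).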




James Joseph Sylvester (1814-1897, 
English) made fundamental contribu- 
tions to matrix theory, invariant theory, 
number theory, partition theory and 
combinatorics. He played a leadership 
role in American mathematics in the 
latter half of the 19th century as a professor
at the Johns Hopkins University
and as founder of the American Journal 
of Mathematics. (Wikipedia) 



The above review of linear algebra is very basic. In contrast, we next discuss
Sylvester's theorem, which you probably have not previously encountered.
Sylvester's theorem is useful when multiplying the same 2 × 2 matrix (with a
determinant of unity) together many times (i.e. raising the matrix to a power). This
situation occurs when modeling periodic multilayer mirror coatings or when
considering light rays trapped in a laser cavity as they reflect many times.

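Sylvester's Theorem:⁴ If the determinant of a 2 × 2 matrix is one (i.e. $AD - BC = 1$),
then

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^N
= \frac{1}{\sin\theta}\begin{bmatrix} A\sin N\theta - \sin(N-1)\theta & B\sin N\theta \\ C\sin N\theta & D\sin N\theta - \sin(N-1)\theta \end{bmatrix} \tag{0.37}$$

where

$$\cos\theta = \frac{1}{2}(A + D) \tag{0.38}$$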



4 The theorem presented here is a specific case. See A. A. Tovar and L. W. Casperson, "Generalized 
Sylvester theorems for periodic applications in matrix optics," J. Opt. Soc. Am. A 12, 578-590 (1995). 
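Proof of Sylvester's theorem by induction

When $N = 1$, the equation is seen to be correct by direct substitution. Next we
assume that the theorem holds for arbitrary $N$, and we check to see if it holds for
$N + 1$:

$$\begin{aligned}
\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{N+1}
&= \frac{1}{\sin\theta}\begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} A\sin N\theta - \sin(N-1)\theta & B\sin N\theta \\ C\sin N\theta & D\sin N\theta - \sin(N-1)\theta \end{bmatrix} \\
&= \frac{1}{\sin\theta}\begin{bmatrix} (A^2 + BC)\sin N\theta - A\sin(N-1)\theta & (AB + BD)\sin N\theta - B\sin(N-1)\theta \\ (AC + CD)\sin N\theta - C\sin(N-1)\theta & (D^2 + BC)\sin N\theta - D\sin(N-1)\theta \end{bmatrix}
\end{aligned}$$

Now we inject the condition $AD - BC = 1$ into the diagonal elements and obtain

$$\frac{1}{\sin\theta}\begin{bmatrix} (A^2 + AD - 1)\sin N\theta - A\sin(N-1)\theta & B\left[(A + D)\sin N\theta - \sin(N-1)\theta\right] \\ C\left[(A + D)\sin N\theta - \sin(N-1)\theta\right] & (D^2 + AD - 1)\sin N\theta - D\sin(N-1)\theta \end{bmatrix}$$

and then

$$\frac{1}{\sin\theta}\begin{bmatrix} A\left[(A + D)\sin N\theta - \sin(N-1)\theta\right] - \sin N\theta & B\left[(A + D)\sin N\theta - \sin(N-1)\theta\right] \\ C\left[(A + D)\sin N\theta - \sin(N-1)\theta\right] & D\left[(A + D)\sin N\theta - \sin(N-1)\theta\right] - \sin N\theta \end{bmatrix}$$

In each matrix element, the expression

$$(A + D)\sin N\theta = 2\cos\theta\sin N\theta = \sin(N+1)\theta + \sin(N-1)\theta \tag{0.39}$$

occurs, which we have rearranged using $\cos\theta = \frac{1}{2}(A + D)$ while twice invoking
(0.14). The result is

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{N+1}
= \frac{1}{\sin\theta}\begin{bmatrix} A\sin(N+1)\theta - \sin N\theta & B\sin(N+1)\theta \\ C\sin(N+1)\theta & D\sin(N+1)\theta - \sin N\theta \end{bmatrix}$$

which completes the proof.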
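
Sylvester's theorem can be spot-checked by comparing (0.37) with repeated matrix multiplication. The Python sketch below is an illustration under stated assumptions: the example matrix is arbitrary apart from having unit determinant, the function names are hypothetical, and the formula is coded only for the case $|A + D| < 2$, where $\theta$ in (0.38) is real and $\sin\theta$ is nonzero.

```python
import math

def matmul(p, q):
    """Product of two 2 x 2 matrices stored as nested tuples (rows times columns)."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def power_direct(m, n):
    """Raise m to the n-th power by repeated multiplication."""
    result = ((1.0, 0.0), (0.0, 1.0))      # identity matrix
    for _ in range(n):
        result = matmul(m, result)
    return result

def power_sylvester(m, n):
    """Raise m to the n-th power using (0.37); assumes det m = 1 and |A + D| < 2."""
    (a, b), (c, d) = m
    theta = math.acos((a + d) / 2)         # from (0.38)
    s = math.sin(theta)
    sn = math.sin(n * theta)
    sn1 = math.sin((n - 1) * theta)
    return (((a * sn - sn1) / s, b * sn / s),
            (c * sn / s, (d * sn - sn1) / s))

M = ((1.0, 0.5), (-0.5, 0.75))             # assumed example; AD - BC = 0.75 + 0.25 = 1
N = 7
direct = power_direct(M, N)
formula = power_sylvester(M, N)
print(direct)
print(formula)
```

The two results agree element by element to within rounding error.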



0.4 Fourier Theory 

In optics, it is common to decompose complicated light fields into a superposition
of pure sinusoidal waves. This is called Fourier analysis.⁵ This is important
since individual sine waves tend to move differently through optical systems (say,
a piece of glass with a frequency-dependent index). After propagation through a
system, we can also reassemble sinusoidal waves to see the effect on the overall
waveform. In fact, it will be possible to work simultaneously with infinitely
many sinusoidal waves, where the frequencies comprising a light field are spread 
continuously over a range. Fourier transforms are also helpful for diffraction 
problems where many waves (all with the same frequency) interfere spatially. 



5 See Murray R. Spiegel, Schaum's Outline of Advanced Mathematics for Engineers and Scientists, 
Chaps. 7-8 (New York: McGraw-Hill 1971). 
We begin with a derivation of the Fourier integral theorem. As asserted by
Fourier, a periodic function can be represented in terms of sines and cosines in
the following manner:

$$f(t) = \sum_{n=0}^{\infty} a_n\cos(n\Delta\omega\, t) + b_n\sin(n\Delta\omega\, t) \tag{0.40}$$

Joseph Fourier (1768-1830, French)
was born to a tailor in Auxerre, France.
He was orphaned at age eight. Because
of his humble background, which closed
some doors to his education and career,
he became a prominent supporter of the
French Revolution. He was rewarded
by an appointment to a position in the
Ecole Polytechnique. In 1798, he participated
in Napoleon's expedition to Egypt
and served as governor over lower Egypt
for a time. Fourier made significant contributions
to the study of heat transfer
and vibrations (presented in 1822), and
it was in this context that he asserted
that functions could be represented as a
series of sine waves. (Wikipedia)



This is called a Fourier expansion. It is similar in idea to a Taylor's series (0.18), 
which rewrites a function as a polynomial. In both cases, the goal is to represent 
one function in terms of a linear combination of other functions (requiring a 
complete basis set). In a Taylor's series the basis functions are polynomials and 
in a Fourier expansion the basis functions are sines and cosines with various 
frequencies (multiples of a fundamental frequency). 

By inspection, we see that all terms in (0.40) repeat with a maximum period
of $2\pi/\Delta\omega$. In other words, a Fourier series is good for functions where $f(t) =
f(t + 2\pi/\Delta\omega)$. The expansion (0.40) is useful even if $f(t)$ is complex, requiring $a_n$
and $b_n$ to be complex.

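Using (0.16), we can rewrite the sines and cosines in the expansion (0.40) as

$$f(t) = a_0 + \sum_{n=1}^{\infty}\left[a_n\frac{e^{in\Delta\omega t} + e^{-in\Delta\omega t}}{2} + b_n\frac{e^{in\Delta\omega t} - e^{-in\Delta\omega t}}{2i}\right] \tag{0.41}$$

or more simply as

$$f(t) = \sum_{n=-\infty}^{\infty} c_n e^{-in\Delta\omega t} \tag{0.42}$$

where

$$c_{n<0} \equiv \frac{a_{-n} - ib_{-n}}{2}, \qquad c_{n>0} \equiv \frac{a_n + ib_n}{2}, \qquad c_0 \equiv a_0 \tag{0.43}$$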

Notice that if $c_{-n} = c_n^*$ for all $n$, then $f(t)$ is real (i.e. real $a_n$ and $b_n$); otherwise
$f(t)$ is complex. The real parts of the $c_n$ coefficients are connected with the cosine
terms in (0.40), and the imaginary parts of the $c_n$ coefficients are connected with
the sine terms in (0.40).

Given a known function $f(t)$, we can compute the various coefficients $c_n$.
There is a trick for figuring out how to do this. We multiply both sides of (0.42) by
$e^{im\Delta\omega t}$, where $m$ is an integer, and integrate over the function period $2\pi/\Delta\omega$:

$$\begin{aligned}
\int_{-\pi/\Delta\omega}^{\pi/\Delta\omega} f(t)\,e^{im\Delta\omega t}\,dt &= \sum_{n=-\infty}^{\infty} c_n \int_{-\pi/\Delta\omega}^{\pi/\Delta\omega} e^{i(m-n)\Delta\omega t}\,dt \\
&= \sum_{n=-\infty}^{\infty} c_n \left[\frac{e^{i(m-n)\Delta\omega t}}{i(m-n)\Delta\omega}\right]_{-\pi/\Delta\omega}^{\pi/\Delta\omega} \\
&= \sum_{n=-\infty}^{\infty} \frac{2\pi c_n}{\Delta\omega}\,\frac{e^{i(m-n)\pi} - e^{-i(m-n)\pi}}{2i(m-n)\pi} \\
&= \sum_{n=-\infty}^{\infty} \frac{2\pi c_n}{\Delta\omega}\,\frac{\sin[(m-n)\pi]}{(m-n)\pi}
\end{aligned} \tag{0.44}$$

The function $\sin[(m-n)\pi]/[(m-n)\pi]$ is equal to zero for all $n \neq m$, and it is
equal to one when $n = m$ (to see this, use L'Hospital's rule on the zero-over-zero
situation, or just go back and re-perform the above integral for $n = m$). Thus, only
one term contributes to the summation in (0.44). We now have



$$c_m = \frac{\Delta\omega}{2\pi}\int_{-\pi/\Delta\omega}^{\pi/\Delta\omega} f(t)\,e^{im\Delta\omega t}\,dt \tag{0.45}$$

from which the coefficients $c_n$ can be computed, given a function $f(t)$. (Note that
$m$ is a dummy index so we can change it back to $n$ if we like.)
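
To make (0.42) and (0.45) concrete, here is a minimal sketch that evaluates the coefficient integral with a midpoint rule and then rebuilds the function from a few coefficients. The test function, the value of `D_OMEGA`, and all helper names are illustrative assumptions rather than anything prescribed by the text.

```python
import cmath
import math

D_OMEGA = 2.0  # assumed fundamental frequency spacing (Delta omega)

def f_example(t):
    """Hypothetical periodic test function with a_1 = 1 and b_2 = 0.5 in (0.40)."""
    return math.cos(D_OMEGA * t) + 0.5 * math.sin(2 * D_OMEGA * t)

def fourier_coefficient(f, n, d_omega, samples=2000):
    """Approximate c_n from (0.45) with a midpoint rule over one period."""
    half_period = math.pi / d_omega
    dt = 2.0 * half_period / samples
    total = 0j
    for k in range(samples):
        t = -half_period + (k + 0.5) * dt
        total += f(t) * cmath.exp(1j * n * d_omega * t) * dt
    return d_omega / (2.0 * math.pi) * total

def reconstruct(coeffs, d_omega, t):
    """Evaluate the truncated series (0.42) from a dict {n: c_n}."""
    return sum(c * cmath.exp(-1j * n * d_omega * t) for n, c in coeffs.items())

# Coefficients for n = -3 ... 3; only n = +/-1 and n = +/-2 should be nonzero.
coeffs = {n: fourier_coefficient(f_example, n, D_OMEGA) for n in range(-3, 4)}
```

According to (0.43), one expects `coeffs[1]` and `coeffs[-1]` to be close to 0.5, and `coeffs[2]` and `coeffs[-2]` to be close to `0.25j` and `-0.25j`, respectively, so that $c_{-n} = c_n^*$ as required for a real function.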

This completes the circle. If we know the function $f(t)$, we can find the
coefficients $c_n$ via (0.45), and, if we know the coefficients $c_n$, we can generate the
function $f(t)$ via (0.42). If we are feeling a bit silly, we might combine these into a
single identity:

$$f(t) = \sum_{n=-\infty}^{\infty}\left[\frac{\Delta\omega}{2\pi}\int_{-\pi/\Delta\omega}^{\pi/\Delta\omega} f(t')\,e^{in\Delta\omega t'}\,dt'\right]e^{-in\Delta\omega t} \tag{0.46}$$

We start with a function $f(t)$ followed by a lot of computation and obtain the
function back again! (This is not quite as foolish as it first appears, as we will
discuss later.)

As mentioned above, Fourier expansions represent functions $f(t)$ that are
periodic over the interval $2\pi/\Delta\omega$. This is disappointing since many optical waveforms
do not repeat (e.g. a single short laser pulse). Nevertheless, we can represent
a function $f(t)$ that is not periodic if we let the period $2\pi/\Delta\omega$ become infinitely
long. In other words, we can accommodate non-periodic functions if we take the
limit as $\Delta\omega$ goes to zero so that the spacing of terms in the series becomes very
fine. Applying this limit to (0.46) we obtain

$$f(t) = \frac{1}{2\pi}\lim_{\Delta\omega\to 0}\sum_{n=-\infty}^{\infty}\left[e^{-in\Delta\omega t}\int_{-\infty}^{\infty} f(t')\,e^{in\Delta\omega t'}\,dt'\right]\Delta\omega \tag{0.47}$$
At this point, a brief review of the definition of an integral is helpful to better
understand the next step that we shall administer to (0.47).

Changing the summation in (0.47) over to an integral

Recall that an integral is really a summation of rectangles under a curve with finely
spaced steps:

$$\begin{aligned}
\int_a^b g(\omega)\,d\omega &\equiv \lim_{\Delta\omega\to 0}\sum_{n=0}^{\frac{b-a}{\Delta\omega}} g(a + n\Delta\omega)\,\Delta\omega \\
&= \lim_{\Delta\omega\to 0}\sum_{n=-\frac{b-a}{2\Delta\omega}}^{\frac{b-a}{2\Delta\omega}} g\!\left(\frac{a+b}{2} + n\Delta\omega\right)\Delta\omega
\end{aligned} \tag{0.48}$$

The final expression has been manipulated so that the index ranges through both
negative and positive numbers. If we set $a = -b$ and take the limit $b \to \infty$, then the
above expression becomes

$$\int_{-\infty}^{\infty} g(\omega)\,d\omega = \lim_{\Delta\omega\to 0}\sum_{n=-\infty}^{\infty} g(n\Delta\omega)\,\Delta\omega \tag{0.49}$$

This concludes our short review of calculus.

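Now, (0.47) has the same form as (0.49) if $g(n\Delta\omega)$ represents everything in
the square brackets of (0.47). The result is the Fourier integral theorem:

$$f(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t')\,e^{i\omega t'}\,dt'\right]e^{-i\omega t}\,d\omega \tag{0.50}$$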



The piece in brackets is called the Fourier transform, and the rest of the operation
is called the inverse Fourier transform. The Fourier integral theorem (0.50) is often
written with the following (potentially confusing) notation:

$$\begin{aligned}
f(\omega) &\equiv \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t)\,e^{i\omega t}\,dt \\
f(t) &\equiv \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(\omega)\,e^{-i\omega t}\,d\omega
\end{aligned} \tag{0.51}$$

The transform and inverse transform are also sometimes written as $f(\omega) \equiv
\mathcal{F}\{f(t)\}$ and $f(t) \equiv \mathcal{F}^{-1}\{f(\omega)\}$. Note that the functions $f(t)$ and $f(\omega)$ are
entirely different, even taking on different units (e.g. the latter having extra units of
per frequency). The two functions are distinguished by their arguments, which
also have different units (e.g. time vs. frequency). Nevertheless, it is customary to
use the same letter to denote either function since they form a transform pair.
You should be aware that it is arbitrary which of the expressions in (0.51) is
called the transform and which is called the inverse transform. In other words, the
signs in the exponents of (0.51) may be interchanged (and this convention varies
in published works!). Also, the factor $2\pi$ may be placed on either the transform or
the inverse transform, or divided equally between the two as has been done here.

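Example 0.6

Compute the Fourier transform of $E(t) = E_0 e^{-t^2/2T^2} e^{-i\omega_0 t}$ followed by the inverse
Fourier transform.

Solution: According to (0.51), the Fourier transform is

$$E(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\left(E_0 e^{-t^2/2T^2} e^{-i\omega_0 t}\right)e^{i\omega t}\,dt = \frac{E_0}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{t^2}{2T^2} + i(\omega - \omega_0)t}\,dt$$

The integration can be performed with the help of (0.55), which yields

$$E(\omega) = \frac{E_0}{\sqrt{2\pi}}\sqrt{\frac{\pi}{1/2T^2}}\;e^{\frac{\left[i(\omega - \omega_0)\right]^2}{4\left(1/2T^2\right)}} = T E_0\, e^{-T^2(\omega - \omega_0)^2/2}$$

Similarly, the inverse Fourier transform of the above function is

$$E(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\left(T E_0\, e^{-T^2(\omega - \omega_0)^2/2}\right)e^{-i\omega t}\,d\omega = \frac{T E_0}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{T^2}{2}\omega^2 + \left(T^2\omega_0 - it\right)\omega - \frac{T^2}{2}\omega_0^2}\,d\omega$$

where again we use (0.55) to obtain

$$E(t) = \frac{T E_0}{\sqrt{2\pi}}\sqrt{\frac{\pi}{T^2/2}}\;e^{\frac{\left(T^2\omega_0 - it\right)^2}{4\left(T^2/2\right)} - \frac{T^2}{2}\omega_0^2} = E_0 e^{-t^2/2T^2} e^{-i\omega_0 t}$$

which brings us back to where we started.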
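
The analytic result of Example 0.6 can be cross-checked numerically. The sketch below evaluates the transform in (0.51) with a simple midpoint rule and compares it with $T E_0 e^{-T^2(\omega-\omega_0)^2/2}$. The parameter values, the integration window, and the function names are assumptions chosen only for illustration.

```python
import cmath
import math

E0, T, OMEGA0 = 1.5, 2.0, 3.0  # assumed example parameters

def pulse(t):
    """E(t) = E0 exp(-t^2 / 2T^2) exp(-i omega0 t), as in Example 0.6."""
    return E0 * math.exp(-t**2 / (2 * T**2)) * cmath.exp(-1j * OMEGA0 * t)

def fourier_transform(f, omega, t_max, samples=4000):
    """Midpoint-rule approximation of (0.51), truncated to [-t_max, t_max]."""
    dt = 2.0 * t_max / samples
    total = 0j
    for k in range(samples):
        t = -t_max + (k + 0.5) * dt
        total += f(t) * cmath.exp(1j * omega * t) * dt
    return total / math.sqrt(2.0 * math.pi)

def analytic_transform(omega):
    """Closed-form result of Example 0.6."""
    return T * E0 * math.exp(-T**2 * (omega - OMEGA0)**2 / 2)

def transform_error(omega):
    """Absolute difference between numerical and analytic transforms."""
    return abs(fourier_transform(pulse, omega, t_max=10 * T) - analytic_transform(omega))
```

The window `t_max = 10 * T` is wide enough that the truncated Gaussian tails are negligible; a narrower window would show up directly as a larger `transform_error`.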

As was previously mentioned, it would seem rather pointless to perform
a Fourier transform on the function $f(t)$ followed by an inverse Fourier transform,
just to end up with $f(t)$ again. Instead, we will typically apply a frequency-dependent
effect on $f(\omega)$ before performing the inverse Fourier transform. In
this case, the final function will be different from $f(t)$. Keep in mind that $f(\omega)$ is
the continuous analog of the discrete coefficients $c_n$ (or the $a_n$ and $b_n$). The real
part of $f(\omega)$ indicates the amplitudes of the cosine waves necessary to construct
the function $f(t)$. The imaginary part of $f(\omega)$ indicates the amplitudes of the sine
waves necessary to construct the function $f(t)$.

Finally we comment on the Dirac delta function,⁶ which is defined indirectly
through

$$f(t) = \int_{-\infty}^{\infty} f(t')\,\delta(t' - t)\,dt' \tag{0.52}$$



6 See G. B. Arfken and H. J. Weber, Mathematical Methods for Physicists 6th ed., Sect. 1.15 (San 
Diego: Elsevier Academic Press 2005). 
The delta function $\delta(t' - t)$ is zero everywhere except at $t' = t$, where it is infinite
in such a way as to make the integral take on the value of the function $f(t)$. (You
can think of $\delta(t' - t)\,dt'$ as an infinitely tall and infinitely thin rectangle centered
at $t' = t$ with an area unity.) The integral only pays attention to the value of $f(t')$
at the point $t' = t$.

A remarkable attribute of the delta function can be seen from the Fourier
integral theorem. After rearranging the order of integration, the Fourier integral
theorem (0.50) can be written as

$$f(t) = \int_{-\infty}^{\infty} f(t')\left[\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega(t' - t)}\,d\omega\right]dt' \tag{0.53}$$

A comparison of (0.52) and (0.53) shows that you may write the delta function
as a uniform superposition of all frequency components:

$$\delta(t' - t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega(t' - t)}\,d\omega \tag{0.54}$$



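Example 0.7

Use (0.54) to prove Parseval's relation:⁷

$$\int_{-\infty}^{\infty} |f(\omega)|^2\,d\omega = \int_{-\infty}^{\infty} |f(t)|^2\,dt$$

which comes up often in the study of optics.

Solution:

$$\begin{aligned}
\int_{-\infty}^{\infty} |f(\omega)|^2\,d\omega &= \int_{-\infty}^{\infty} f(\omega)\,f^*(\omega)\,d\omega \\
&= \int_{-\infty}^{\infty}\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t)\,e^{i\omega t}\,dt\right]\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f^*(-t')\,e^{i\omega t'}\,dt'\right]d\omega
\end{aligned}$$

The order of integration can be changed to give

$$\begin{aligned}
\int_{-\infty}^{\infty} |f(\omega)|^2\,d\omega &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(t)\,f^*(-t')\left[\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega\left(t' - (-t)\right)}\,d\omega\right]dt\,dt' \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(t)\,f^*(-t')\,\delta\!\left(t' - (-t)\right)dt\,dt' \\
&= \int_{-\infty}^{\infty} f(t)\,f^*(t)\,dt = \int_{-\infty}^{\infty} |f(t)|^2\,dt
\end{aligned}$$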

7 For a more general version of the relation, see G. B. Arfken and H. J. Weber, Mathematical 
Methods for Physicists 6th ed., Sect. 15.5 (San Diego: Elsevier Academic Press 2005) . 
Equation (0.54) was used to reach the final result.
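
Parseval's relation can also be illustrated numerically. The following sketch uses an arbitrarily chosen, rapidly decaying complex test function (an assumption for illustration only), computes its transform by brute-force quadrature using the sign convention of (0.51), and compares the two integrals.

```python
import cmath
import math

def f_time(t):
    """Hypothetical test function f(t) = (1 + i t) exp(-t^2 / 2)."""
    return (1 + 1j * t) * math.exp(-t * t / 2)

def midpoints(limit, samples):
    """Midpoints and spacing of a uniform grid on [-limit, limit]."""
    step = 2.0 * limit / samples
    return [-limit + (k + 0.5) * step for k in range(samples)], step

T_GRID, DT = midpoints(8.0, 800)
W_GRID, DW = midpoints(8.0, 320)

def f_freq(omega):
    """Numerical Fourier transform of f_time, following (0.51)."""
    total = sum(f_time(t) * cmath.exp(1j * omega * t) for t in T_GRID) * DT
    return total / math.sqrt(2.0 * math.pi)

# Both sides of Parseval's relation, each as a midpoint-rule sum.
energy_time = sum(abs(f_time(t)) ** 2 for t in T_GRID) * DT
energy_freq = sum(abs(f_freq(w)) ** 2 for w in W_GRID) * DW
```

For this particular test function both integrals can be done by hand with (0.55) and equal $\tfrac{3}{2}\sqrt{\pi}$, so `energy_time` and `energy_freq` should agree with each other and with that value.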
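Appendix 0.A Table of Integrals and Sums

The following formulas are useful for various problems encountered in the text.

$$\int_{-\infty}^{\infty} e^{-ax^2 + bx + c}\,dx = \sqrt{\frac{\pi}{a}}\;e^{\frac{b^2}{4a} + c} \qquad (\mathrm{Re}\{a\} > 0) \tag{0.55}$$

$$\int_{-\infty}^{\infty} \frac{e^{iax}}{1 + x^2/b^2}\,dx = \pi |b|\,e^{-|ab|} \qquad (b > 0) \tag{0.56}$$

$$\int_0^{2\pi} e^{\pm ia\cos(\theta - \theta')}\,d\theta = 2\pi J_0(a) \tag{0.57}$$

$$\int_0^a J_0(bx)\,x\,dx = \frac{a}{b}J_1(ab) \tag{0.58}$$

$$\int_0^{\infty} e^{-ax^2} J_0(bx)\,x\,dx = \frac{e^{-b^2/4a}}{2a} \tag{0.59}$$

$$\int_0^{\infty} \frac{\sin^2(ax)}{(ax)^2}\,dx = \frac{\pi}{2a} \tag{0.60}$$

$$\int \frac{dy}{\left[y^2 + c\right]^{3/2}} = \frac{y}{c\sqrt{y^2 + c}} \tag{0.61}$$

$$\int \frac{dx}{x\sqrt{x^2 - c}} = -\frac{1}{\sqrt{c}}\sin^{-1}\!\left(\frac{\sqrt{c}}{|x|}\right) \tag{0.62}$$

$$\int_0^{\pi} \sin(ax)\sin(bx)\,dx = \int_0^{\pi} \cos(ax)\cos(bx)\,dx = \frac{\pi}{2}\delta_{ab} \qquad (a, b \text{ integer}) \tag{0.63}$$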

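$$\sum_{n=0}^{N} r^n = \frac{1 - r^{N+1}}{1 - r} \tag{0.64}$$

$$\sum_{n=1}^{N} r^n = \frac{r\left(1 - r^{N}\right)}{1 - r} \tag{0.65}$$

$$\sum_{n=0}^{\infty} r^n = \frac{1}{1 - r} \qquad (r < 1) \tag{0.66}$$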
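Exercises

Exercises for 0.1 Vector Calculus

P0.1 Let $\mathbf{r} = (\hat{\mathbf{x}} + 2\hat{\mathbf{y}} - 3\hat{\mathbf{z}})$ m and $\mathbf{r}_0 = (-\hat{\mathbf{x}} + 3\hat{\mathbf{y}} + 2\hat{\mathbf{z}})$ m.

(a) Find the magnitude of $\mathbf{r}$.

(b) Find $\mathbf{r} - \mathbf{r}_0$.

(c) Find the angle between $\mathbf{r}$ and $\mathbf{r}_0$.

Answer: (a) $r = \sqrt{14}$ m; (c) 94°.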

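P0.2 Use the dot product (0.2) to show that the cross product $\mathbf{E} \times \mathbf{B}$ is
perpendicular to $\mathbf{E}$ and to $\mathbf{B}$.

P0.3 Verify the "BAC-CAB" rule: $\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B})$.

P0.4 Prove the following identity:

$$\nabla_{\mathbf{r}}\frac{1}{|\mathbf{r} - \mathbf{r}'|} = -\frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3},$$

where $\nabla_{\mathbf{r}}$ operates only on $\mathbf{r}$, treating $\mathbf{r}'$ as a constant vector.

P0.5 Prove that $\nabla_{\mathbf{r}} \cdot \dfrac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}$ is zero, except at $\mathbf{r} = \mathbf{r}'$ where a singularity situation
occurs.

P0.6 Verify $\nabla\cdot(\nabla\times\mathbf{f}) = 0$ for any vector function $\mathbf{f}$.

P0.7 Verify $\nabla\times(\mathbf{f}\times\mathbf{g}) = \mathbf{f}(\nabla\cdot\mathbf{g}) - \mathbf{g}(\nabla\cdot\mathbf{f}) + (\mathbf{g}\cdot\nabla)\mathbf{f} - (\mathbf{f}\cdot\nabla)\mathbf{g}$.

P0.8 Verify $\nabla\cdot(\mathbf{f}\times\mathbf{g}) = \mathbf{g}\cdot(\nabla\times\mathbf{f}) - \mathbf{f}\cdot(\nabla\times\mathbf{g})$.

P0.9 Verify $\nabla\cdot(g\mathbf{f}) = \mathbf{f}\cdot\nabla g + g\nabla\cdot\mathbf{f}$ and $\nabla\times(g\mathbf{f}) = (\nabla g)\times\mathbf{f} + g\nabla\times\mathbf{f}$.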

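P0.10 Show that the Laplacian in cylindrical coordinates can be written as

$$\nabla^2 = \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2}{\partial\phi^2} + \frac{\partial^2}{\partial z^2}$$

Solution: (Partial)

Continuing with the approach in Example 0.2, we have

$$\begin{aligned}
\frac{\partial^2 f}{\partial x^2} &= \left(\frac{\partial^2\rho}{\partial x^2}\right)\frac{\partial f}{\partial\rho} + \frac{\partial\rho}{\partial x}\frac{\partial}{\partial x}\frac{\partial f}{\partial\rho} + \left(\frac{\partial^2\phi}{\partial x^2}\right)\frac{\partial f}{\partial\phi} + \frac{\partial\phi}{\partial x}\frac{\partial}{\partial x}\frac{\partial f}{\partial\phi} \\
&= \left(\frac{\partial^2\rho}{\partial x^2}\right)\frac{\partial f}{\partial\rho} + \frac{\partial\rho}{\partial x}\left[\left(\frac{\partial\rho}{\partial x}\right)\frac{\partial^2 f}{\partial\rho^2} + \left(\frac{\partial\phi}{\partial x}\right)\frac{\partial^2 f}{\partial\phi\,\partial\rho}\right] + \left(\frac{\partial^2\phi}{\partial x^2}\right)\frac{\partial f}{\partial\phi} + \frac{\partial\phi}{\partial x}\left[\left(\frac{\partial\rho}{\partial x}\right)\frac{\partial^2 f}{\partial\rho\,\partial\phi} + \left(\frac{\partial\phi}{\partial x}\right)\frac{\partial^2 f}{\partial\phi^2}\right]
\end{aligned}$$

and

$$\begin{aligned}
\nabla^2 f &= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} \\
&= \left(\frac{\partial^2\rho}{\partial x^2} + \frac{\partial^2\rho}{\partial y^2}\right)\frac{\partial f}{\partial\rho} + \left[\left(\frac{\partial\rho}{\partial x}\right)^2 + \left(\frac{\partial\rho}{\partial y}\right)^2\right]\frac{\partial^2 f}{\partial\rho^2} + \left(\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2}\right)\frac{\partial f}{\partial\phi} \\
&\quad + 2\left[\left(\frac{\partial\phi}{\partial x}\right)\left(\frac{\partial\rho}{\partial x}\right) + \left(\frac{\partial\phi}{\partial y}\right)\left(\frac{\partial\rho}{\partial y}\right)\right]\frac{\partial^2 f}{\partial\phi\,\partial\rho} + \left[\left(\frac{\partial\phi}{\partial x}\right)^2 + \left(\frac{\partial\phi}{\partial y}\right)^2\right]\frac{\partial^2 f}{\partial\phi^2} + \frac{\partial^2 f}{\partial z^2}
\end{aligned}$$

The needed first derivatives are given in Example 0.2. The needed second derivatives are

$$\begin{aligned}
\frac{\partial^2\rho}{\partial x^2} &= \frac{1}{\sqrt{x^2 + y^2}} - \frac{x^2}{\left(x^2 + y^2\right)^{3/2}} = \frac{\sin^2\phi}{\rho} \\
\frac{\partial^2\phi}{\partial x^2} &= \frac{2xy}{\left(x^2 + y^2\right)^2} = \frac{2\sin\phi\cos\phi}{\rho^2} \\
\frac{\partial^2\rho}{\partial y^2} &= \frac{1}{\sqrt{x^2 + y^2}} - \frac{y^2}{\left(x^2 + y^2\right)^{3/2}} = \frac{\cos^2\phi}{\rho} \\
\frac{\partial^2\phi}{\partial y^2} &= -\frac{2xy}{\left(x^2 + y^2\right)^2} = -\frac{2\sin\phi\cos\phi}{\rho^2}
\end{aligned}$$

Finish the derivation by substituting these derivatives into the above expression.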

P0.11 Verify Stokes' theorem (0.12) for the function given in Example 0.3. 
Take the surface to be a square in the xy-plane contained by |x| = ±1 
and | y| = ±1, as illustrated in Fig. 0.6. 

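P0.12 Verify the following vector integral theorem for the same volume used
in Example 0.3, but with $\mathbf{F} = y^2 x\,\hat{\mathbf{x}} + xy\,\hat{\mathbf{z}}$ and $\mathbf{G} = x^2\,\hat{\mathbf{x}}$:

$$\int_V \left[\mathbf{F}(\nabla\cdot\mathbf{G}) + (\mathbf{G}\cdot\nabla)\mathbf{F}\right]dv = \oint_S \mathbf{F}\,(\mathbf{G}\cdot\hat{\mathbf{n}})\,da$$

Figure 0.6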



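P0.13 Use the divergence theorem to show that the function in P0.5 is $4\pi$
times the three-dimensional delta function

$$\delta^3(\mathbf{r}' - \mathbf{r}) \equiv \delta(x' - x)\,\delta(y' - y)\,\delta(z' - z)$$

which has the property that

$$\int_V \delta^3(\mathbf{r}' - \mathbf{r})\,dv = \begin{cases} 1 & \text{if } V \text{ contains } \mathbf{r}' \\ 0 & \text{otherwise} \end{cases}$$

Solution: We have by the divergence theorem

$$\oint_S \frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\cdot\hat{\mathbf{n}}\,da = \int_V \nabla_{\mathbf{r}}\cdot\frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\,dv$$

From P0.5, the argument in the integral on the right-hand side is zero except at $\mathbf{r} = \mathbf{r}'$. Therefore,
if the volume $V$ does not contain the point $\mathbf{r} = \mathbf{r}'$, then the result of both integrals must be zero.
Let us construct a volume between an arbitrary surface $S_1$ containing $\mathbf{r} = \mathbf{r}'$ and $S_2$, the surface
of a tiny sphere centered on $\mathbf{r} = \mathbf{r}'$. Since the point $\mathbf{r} = \mathbf{r}'$ is excluded by the tiny sphere, the result
of either integral in the divergence theorem is still zero. However, we have on the tiny sphere
(writing $r_\epsilon$ for its radius, with the outward normal of the enclosed volume pointing toward $\mathbf{r}'$)

$$\oint_{S_2} \frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\cdot\hat{\mathbf{n}}\,da = -\int_0^{2\pi}\!\!\int_0^{\pi}\left(\frac{1}{r_\epsilon^2}\right)r_\epsilon^2\sin\phi\,d\phi\,d\alpha = -4\pi$$

Therefore, for the outer surface $S_1$ (containing $\mathbf{r} = \mathbf{r}'$) we must have the equal and opposite
result:

$$\oint_{S_1} \frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\cdot\hat{\mathbf{n}}\,da = 4\pi$$

This implies

$$\int_V \nabla_{\mathbf{r}}\cdot\frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\,dv = \begin{cases} 4\pi & \text{if } V \text{ contains } \mathbf{r}' \\ 0 & \text{otherwise} \end{cases}$$

The integrand exhibits the same characteristics as the delta function. Therefore,
$\nabla_{\mathbf{r}}\cdot\dfrac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} = 4\pi\,\delta^3(\mathbf{r} - \mathbf{r}')$. The delta function is defined in (0.52).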



Exercises for 0.2 Complex Numbers 

P0.14 Using only a calculator's arithmetic and trigonometric functions, compute
$z_1 - z_2$ and $z_1/z_2$ in both rectangular and polar form for $z_1 = 1 - i$
and $z_2 = 3 + 4i$.

P0.15 Show that

$$\frac{a - ib}{a + ib} = e^{-2i\tan^{-1}\frac{b}{a}}$$

regardless of the sign of $a$, assuming $a$ and $b$ are real.

P0.16 Invert (0.15) to get both formulas in (0.16). HINT: You can get a second
equation by considering Euler's equation with a negative angle $-\phi$.

P0.17 Show $\mathrm{Re}\{A\} \times \mathrm{Re}\{B\} = (AB + A^*B)/4 + \text{C.C.}$

P0.18 If $E_0 = |E_0|e^{i\delta_E}$ and $B_0 = |B_0|e^{i\delta_B}$, and if $k$, $z$, $\omega$, and $t$ are all real, prove

$$\mathrm{Re}\left\{E_0 e^{i(kz - \omega t)}\right\}\mathrm{Re}\left\{B_0 e^{i(kz - \omega t)}\right\} = \frac{1}{4}\left(E_0^* B_0 + E_0 B_0^*\right) + \frac{1}{2}|E_0||B_0|\cos\left[2(kz - \omega t) + \delta_E + \delta_B\right]$$

P0.19 (a) If $\sin\phi = 2$, show that $\cos\phi = i\sqrt{3}$. HINT: Use $\sin^2\phi + \cos^2\phi = 1$.
(b) Show that the angle $\phi$ in (a) is $\pi/2 - i\ln(2 + \sqrt{3})$.

P0.20 Write $A\cos(\omega t) + 2A\sin(\omega t + \pi/4)$ as a simple phase-shifted cosine wave
(i.e. find the amplitude and phase of the resultant cosine wave).
Exercises for 0.4 Fourier Theory

P0.21 Prove that Fourier Transforms have the property of linear superposition:

$$\mathcal{F}\{a\,g(t) + b\,h(t)\} = a\,g(\omega) + b\,h(\omega)$$

where $g(\omega) \equiv \mathcal{F}\{g(t)\}$ and $h(\omega) \equiv \mathcal{F}\{h(t)\}$.

P0.22 Prove $\mathcal{F}\{g(at)\} = \dfrac{1}{|a|}\,g\!\left(\dfrac{\omega}{a}\right)$.

P0.23 Prove $\mathcal{F}\{g(t - \tau)\} = g(\omega)\,e^{i\omega\tau}$.

P0.24 Show that the Fourier transform of $E(t) = E_0 e^{-(t/T)^2}\cos\omega_0 t$ is

$$E(\omega) = \frac{T E_0}{2\sqrt{2}}\left(e^{-\frac{(\omega + \omega_0)^2}{4/T^2}} + e^{-\frac{(\omega - \omega_0)^2}{4/T^2}}\right)$$

P0.25 Take the inverse Fourier transform of the result in P0.24. Check that it
returns exactly the original function.

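P0.26 The following operation is referred to as the convolution of the functions
$g(t)$ and $h(t)$:

$$g(t)\otimes h(t)\Big|_{\tau} \equiv \int_{-\infty}^{\infty} g(t)\,h(\tau - t)\,dt$$

A convolution measures the overlap of $g(t)$ and a reversed $h(t)$ as a
function of the offset $\tau$. The result is a function of $\tau$.

(a) Prove the convolution theorem:

$$\mathcal{F}\left\{g(t)\otimes h(t)\Big|_{\tau}\right\}\Big|_{\omega} = \sqrt{2\pi}\,g(\omega)\,h(\omega)$$

(b) Prove this related form of the convolution theorem:

$$\mathcal{F}\{g(t)\,h(t)\}\Big|_{\omega} = \frac{1}{\sqrt{2\pi}}\,g(\omega')\otimes h(\omega')\Big|_{\omega}$$

Solution: Part (a)

$$\begin{aligned}
\mathcal{F}\left\{\int_{-\infty}^{\infty} g(t)\,h(\tau - t)\,dt\right\}\Bigg|_{\omega} &= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\left\{\int_{-\infty}^{\infty} g(t)\,h(\tau - t)\,dt\right\}e^{i\omega\tau}\,d\tau \qquad (\text{Let } \tau = t' + t) \\
&= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(t)\,h(t')\,e^{i\omega(t' + t)}\,dt\,dt' \\
&= \sqrt{2\pi}\;\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} g(t)\,e^{i\omega t}\,dt\;\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} h(t')\,e^{i\omega t'}\,dt' \\
&= \sqrt{2\pi}\,g(\omega)\,h(\omega)
\end{aligned}$$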
P0.27 Prove the autocorrelation theorem:

$$\mathcal{F}\left\{\int_{-\infty}^{\infty} h(t)\,h^*(t - \tau)\,dt\right\} = \sqrt{2\pi}\,|h(\omega)|^2$$

P0.28 (a) Compute the Fourier transform of a Gaussian function, $g(t) =
e^{-t^2/2T^2}$. Do the integral by hand using the table in Appendix 0.A.

(b) Compute the Fourier transform of a sine function, $h(t) = \sin\omega_0 t$.
Do the integral by hand using $\sin(x) = (e^{ix} - e^{-ix})/2i$, combined with
the integral formula (0.54).

(c) Use your results to parts (a) and (b) and a convolution theorem from
P0.26(b) to evaluate the Fourier transform of $f(t) = e^{-t^2/2T^2}\sin\omega_0 t$. (The
answer should be similar to P0.24.)

(d) Plot $f(t)$ and the imaginary part of its Fourier transform for the
parameters $\omega_0 = 1$ and $T = 8$.


Chapter 1 

Electromagnetic Phenomena 



In 1861, James Maxwell assembled the various known relationships of electricity
and magnetism into a concise¹ set of equations:²

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0} \qquad \text{(Gauss's Law)} \tag{1.1}$$

$$\nabla\cdot\mathbf{B} = 0 \qquad \text{(Gauss's Law for magnetism)} \tag{1.2}$$

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} \qquad \text{(Faraday's Law)} \tag{1.3}$$

$$\nabla\times\frac{\mathbf{B}}{\mu_0} = \epsilon_0\frac{\partial\mathbf{E}}{\partial t} + \mathbf{J} \qquad \text{(Ampere's Law revised by Maxwell)} \tag{1.4}$$

Here $\mathbf{E}$ and $\mathbf{B}$ represent electric and magnetic fields, respectively. The charge
density $\rho$ describes the charge per volume distributed through space.³ The current
density $\mathbf{J}$ describes the motion of charge density (in units of $\rho$ times velocity). The
constant $\epsilon_0$ is called the permittivity, and the constant $\mu_0$ is called the permeability.
Taken together, these are known as Maxwell's equations.

After introducing a key revision of Ampere's law, Maxwell realized that together 
these equations comprise a complete self-consistent theory of electromagnetic 
phenomena. Moreover, the equations imply the existence of electromagnetic 
waves, which travel at the speed of light. Since the speed of light had been 
measured before Maxwell's time, it was immediately apparent (as was already 
suspected) that light is a high-frequency manifestation of the same phenomena 
that govern the influence of currents and charges upon each other. Previously, 
optics had been considered a topic quite separate from electricity and magnetism. 
Once the connection was made, it became clear that Maxwell's equations form 
the theoretical foundations of optics, and this is where we begin our study of light. 



1 In Maxwell's original notation, this set of equations was hardly concise, written without the
convenience of modern vector notation or $\nabla$. His formulation wouldn't fit easily on a T-shirt!

2 See J. D. Jackson, Classical Electrodynamics, 3rd ed., p. 1 (New York: John Wiley, 1999) or the 
back cover of D. J. Griffiths, Introduction to Electrodynamics, 3rd ed. (New Jersey: Prentice-Hall, 
1999). 

3 Later in the book we use p for the radius in cylindrical coordinates, not to be confused with 
charge density. 
1.1 Gauss' Law

Figure 1.1 The geometry of Coulomb's law for a point charge.

Figure 1.2 The geometry of Coulomb's law for a charge distribution.



The force on a point charge $q$ located at $\mathbf{r}$ exerted by another point charge $q'$
located at $\mathbf{r}'$ is

$$\mathbf{F} = q\,\mathbf{E}(\mathbf{r}) \tag{1.5}$$

where

$$\mathbf{E}(\mathbf{r}) = \frac{q'}{4\pi\epsilon_0}\frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} \tag{1.6}$$

This relationship is known as Coulomb's law. The force is directed along the
vector $\mathbf{r} - \mathbf{r}'$, which points from charge $q'$ to $q$ as seen in Fig. 1.1. The length or
magnitude of this vector is given by $|\mathbf{r} - \mathbf{r}'|$ (i.e. the distance between $q'$ and $q$).
The familiar inverse square law can be seen by noting that $(\mathbf{r} - \mathbf{r}')/|\mathbf{r} - \mathbf{r}'|$ is a unit
vector. We have written the force in terms of an electric field $\mathbf{E}(\mathbf{r})$, which is defined
throughout space (regardless of whether a second charge $q$ is actually present).
The permittivity $\epsilon_0$ amounts to a proportionality constant.

The total force from a collection of charges is found by summing expression
(1.5) over all charges $q'_n$ associated with their specific locations $\mathbf{r}'_n$. If the charges
are distributed continuously throughout space, having density $\rho(\mathbf{r}')$ (units of
charge per volume), the summation for finding the net electric field at $\mathbf{r}$ becomes
an integral:

$$\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\epsilon_0}\int_V \rho(\mathbf{r}')\frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\,dv' \tag{1.7}$$

This three-dimensional integral⁴ gives the net electric field produced by the
charge density $\rho$ distributed throughout the volume $V$.
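
The superposition described above can be sketched for a discrete set of point charges by summing (1.6) over the charges. In this minimal illustration the function name, the data layout, and the example charges are assumptions; the value of $\epsilon_0$ is the standard SI value.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m

def electric_field(r, charges):
    """Net field at point r = (x, y, z) from charges given as [(q, (x', y', z')), ...].

    Each charge contributes q' (r - r') / (4 pi eps0 |r - r'|^3), as in (1.6).
    """
    ex = ey = ez = 0.0
    for q, (xp, yp, zp) in charges:
        dx, dy, dz = r[0] - xp, r[1] - yp, r[2] - zp
        dist = math.sqrt(dx * dx + dy * dy + dz * dz)
        k = q / (4.0 * math.pi * EPSILON_0 * dist**3)
        ex += k * dx
        ey += k * dy
        ez += k * dz
    return (ex, ey, ez)

# Hypothetical examples: a single 1 nC charge, and a pair of opposite charges.
single = [(1e-9, (0.0, 0.0, 0.0))]
pair = [(1e-9, (0.0, 0.0, 0.5)), (-1e-9, (0.0, 0.0, -0.5))]

field_at_1m = electric_field((1.0, 0.0, 0.0), single)
field_at_2m = electric_field((2.0, 0.0, 0.0), single)
field_of_pair = electric_field((1.0, 0.0, 0.0), pair)
```

Doubling the distance from the single charge reduces the field by a factor of four (the inverse square law), while for the pair the transverse components cancel and only a $z$ component pointing from the positive toward the negative charge remains. Replacing the sum by an integral over a charge density gives (1.7).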

Gauss' law (1.1), the first of Maxwell's equations, follows directly from (1.7) 
with some mathematical manipulation. No new physical phenomenon is intro- 
duced in this process. 5 



Derivation of Gauss' law

We begin with the divergence of (1.7):

$$\nabla\cdot\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\epsilon_0}\int_V \rho(\mathbf{r}')\,\nabla_{\mathbf{r}}\cdot\frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\,dv' \tag{1.8}$$

The subscript on $\nabla_{\mathbf{r}}$ indicates that it operates on $\mathbf{r}$ while treating $\mathbf{r}'$, the dummy
variable of integration, as a constant. The integrand contains a remarkable mathematical
property that can be exploited, even without specifying the form of the
charge distribution $\rho(\mathbf{r}')$. In modern mathematical language, the vector expression
in the integral is a three-dimensional delta function (see (0.52)):⁶

$$\nabla_{\mathbf{r}}\cdot\frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} \equiv 4\pi\,\delta^3(\mathbf{r}' - \mathbf{r}) \equiv 4\pi\,\delta(x' - x)\,\delta(y' - y)\,\delta(z' - z) \tag{1.9}$$

A derivation of this formula is addressed in problem P0.13. The delta function
allows the integral in (1.8) to be performed, and the relation becomes simply

$$\nabla\cdot\mathbf{E}(\mathbf{r}) = \frac{\rho(\mathbf{r})}{\epsilon_0}$$

which is the differential form of Gauss' law (1.1).

4 Here $dv'$ stands for $dx'\,dy'\,dz'$ and $\mathbf{r}' = x'\hat{\mathbf{x}} + y'\hat{\mathbf{y}} + z'\hat{\mathbf{z}}$ (in Cartesian coordinates).

5 Actually, Coulomb's law applies only to static charge configurations, and in that sense it is
incomplete since it implies an instantaneous response of the field to a reconfiguration of the
charge. The generalized version of Coulomb's law, one of Jefimenko's equations, incorporates
the fact that electromagnetic news travels at the speed of light. See D. J. Griffiths, Introduction
to Electrodynamics, 3rd ed., Sect. 10.2.2 (New Jersey: Prentice-Hall, 1999). Ironically, Gauss' law,
which can be derived from Coulomb's law, holds perfectly whether the charges remain still or are in
motion.



The (perhaps more familiar) integral form of Gauss' law can be obtained by
integrating (1.1) over a volume $V$ and applying the divergence theorem (0.11) to
the left-hand side:

$$\oint_S \mathbf{E}(\mathbf{r})\cdot\hat{\mathbf{n}}\,da = \frac{1}{\epsilon_0}\int_V \rho(\mathbf{r})\,dv \tag{1.10}$$

This form of Gauss' law shows that the total electric field flux extruding through a
closed surface $S$ (i.e. the integral on the left side) is proportional to the net charge
contained within it (i.e. within volume $V$ contained by $S$).



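Example 1.1

Suppose we have an electric field given by $\mathbf{E} = (\alpha x^2 y^3\,\hat{\mathbf{x}} + \beta z^4\,\hat{\mathbf{y}})\cos\omega t$. Use Gauss'
law (1.1) to find the charge density $\rho(x, y, z, t)$.

Solution:

$$\rho = \epsilon_0\nabla\cdot\mathbf{E} = \epsilon_0\left(\hat{\mathbf{x}}\frac{\partial}{\partial x} + \hat{\mathbf{y}}\frac{\partial}{\partial y} + \hat{\mathbf{z}}\frac{\partial}{\partial z}\right)\cdot\left(\alpha x^2 y^3\,\hat{\mathbf{x}} + \beta z^4\,\hat{\mathbf{y}}\right)\cos\omega t = 2\epsilon_0\alpha x y^3\cos\omega t$$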



1.2 Gauss' Law for Magnetic Fields

In order to 'feel' a magnetic force, a charge $q$ must be moving at some velocity (call
it $\mathbf{v}$). The magnetic field itself arises from charges that are in motion. We consider
the magnetic field to arise from a distribution of moving charges described by a
current density $\mathbf{J}(\mathbf{r}')$ throughout space. The current density has units of charge
times velocity per volume (or equivalently current per cross sectional area). The
magnetic force law analogous to Coulomb's law is

$$\mathbf{F} = q\,\mathbf{v}\times\mathbf{B} \tag{1.11}$$

6 For a derivation of Gauss' law from Coulomb's law that does not rely directly on the Dirac delta 
function, see J. D. Jackson, Classical Electrodynamics 3rd ed., pp. 27-29 (New York: John Wiley, 
1999). 




Figure 1.3 Gauss' law in integral 
form relates the flux of the elec- 
tric field through a surface to the 
charge contained inside that sur- 
face. 




Carl Friedrich Gauss (1777-1855, Ger- 
man) was born in Braunschweig, Ger- 
many to a poor family. Gauss was a 
child prodigy, and he made his first sig- 
nificant advances to mathematics as a 
teenager. In grade school, he purport- 
edly was asked to add all integers from 
1 to 100, which he did in seconds to the 
astonishment of his teacher. (Presum- 
ably, Friedrich immediately realized that 
the numbers form fifty pairs equal to 
101.) Gauss made important advances 
in number theory and differential geome- 
try. He developed the law discussed here 
as one of Maxwell's equations in 1835, 
but it was not published until 1867, af- 
ter Gauss' death. Ironically, Maxwell 
was already using Gauss' law by that 
time. (Wikipedia) 
where

$$\mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int_V \mathbf{J}(\mathbf{r}')\times\frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\,dv' \tag{1.12}$$

Jean-Baptiste Biot (1774-1862,
French) was born in Paris. He attended
the Ecole Polytechnique where mathematician
Gaspard Monge recognized his
academic potential. After graduating,
Biot joined the military and then took
part in an insurrection on the side of
the Royalists. He was captured, and
his career might have met a tragic
ending there had Monge not successfully
pleaded for his release from jail.
Biot went on to become a professor
of physics at the College de France.
Among other contributions, Biot participated
in the first hot-air balloon ride
with Gay-Lussac and correctly deduced
that meteorites that fell on L'Aigle,
France in 1803 came from space. Later
Biot collaborated with the younger Felix
Savart (1791-1841) on the theory of
magnetism and electrical currents. They
formulated their famous law in 1820.
(Wikipedia)



The latter equation is known as the Biot-Savart law. The permeability $\mu_0$ dictates
the strength of the magnetic field, given the current distribution.

As with Coulomb's law, we can apply mathematics to the Biot-Savart law
to obtain another of Maxwell's equations. Nevertheless, the essential physics
is already inherent in the Biot-Savart law.⁷ Using the result from P0.4, we can
rewrite (1.12) as⁸

$$\mathbf{B}(\mathbf{r}) = -\frac{\mu_0}{4\pi}\int_V \mathbf{J}(\mathbf{r}')\times\nabla_{\mathbf{r}}\frac{1}{|\mathbf{r} - \mathbf{r}'|}\,dv' = \frac{\mu_0}{4\pi}\nabla\times\int_V \frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,dv' \tag{1.13}$$

Since the divergence of a curl is identically zero (see P0.6), we get straight away
the second of Maxwell's equations (1.2)

$$\nabla\cdot\mathbf{B} = 0$$

which is known as Gauss' law for magnetic fields. (Two equations down; two to
go.)

The similarity between $\nabla\cdot\mathbf{B} = 0$ and $\nabla\cdot\mathbf{E} = \rho/\epsilon_0$, Gauss' law for electric fields,
is immediately apparent. In integral form, Gauss' law for magnetic fields looks the
same as (1.10), only with zero on the right-hand side. If one were to imagine the
existence of magnetic monopoles (i.e. isolated north or south 'charges'), then the
right-hand side would not be zero. The law implies that the total magnetic flux
extruding through any closed surface balances, with as many field lines pointing
inwards as pointing outwards.



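Example 1.2

The field surrounding a magnetic dipole is given by

$$\mathbf{B} = \beta\left[3xz\,\hat{\mathbf{x}} + 3yz\,\hat{\mathbf{y}} + \left(3z^2 - r^2\right)\hat{\mathbf{z}}\right]/r^5$$

where $r \equiv \sqrt{x^2 + y^2 + z^2}$. Show that this field satisfies Gauss' law for magnetic
fields (1.2).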



7 Like Coulomb's law, the Biot-Savart law is incomplete since it also implies an instantaneous
response of the magnetic field to a reconfiguration of the currents. The generalized version of the
Biot-Savart law, another of Jefimenko's equations, incorporates the fact that electromagnetic news
travels at the speed of light. Ironically, Gauss' law for magnetic fields and Maxwell's version of
Ampere's law, derived from the Biot-Savart law, hold perfectly whether the currents are steady or
vary in time. The Jefimenko equations, analogs of Coulomb and Biot-Savart, also embody Faraday's
law, the only one of Maxwell's equations that cannot be derived from the usual forms of Coulomb's law
and the Biot-Savart law. See D. J. Griffiths, Introduction to Electrodynamics, 3rd ed., Sect. 10.2.2
(New Jersey: Prentice-Hall, 1999).

8 Note that $\nabla_{\mathbf{r}}$ ignores the variable of integration $\mathbf{r}'$.
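Solution:

$$\begin{aligned}
\nabla\cdot\mathbf{B} &= \beta\left[3\frac{\partial}{\partial x}\left(\frac{xz}{r^5}\right) + 3\frac{\partial}{\partial y}\left(\frac{yz}{r^5}\right) + \frac{\partial}{\partial z}\left(\frac{3z^2}{r^5} - \frac{1}{r^3}\right)\right] \\
&= \beta\left[3\left(\frac{z}{r^5} - \frac{5xz}{r^6}\frac{\partial r}{\partial x}\right) + 3\left(\frac{z}{r^5} - \frac{5yz}{r^6}\frac{\partial r}{\partial y}\right) + \left(\frac{6z}{r^5} - \frac{15z^2}{r^6}\frac{\partial r}{\partial z} + \frac{3}{r^4}\frac{\partial r}{\partial z}\right)\right] \\
&= \beta\left[\frac{12z}{r^5} - \frac{15z}{r^6}\left(x\frac{\partial r}{\partial x} + y\frac{\partial r}{\partial y} + z\frac{\partial r}{\partial z}\right) + \frac{3}{r^4}\frac{\partial r}{\partial z}\right]
\end{aligned}$$

The necessary derivatives are $\partial r/\partial x = x/\sqrt{x^2 + y^2 + z^2} = x/r$, $\partial r/\partial y = y/r$, and
$\partial r/\partial z = z/r$, which lead to

$$\nabla\cdot\mathbf{B} = \beta\left[\frac{12z}{r^5} - \frac{15z}{r^5} + \frac{3z}{r^5}\right] = 0$$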
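
The vanishing divergence found in Example 1.2 can be checked numerically with central finite differences. This is only a sketch under stated assumptions: the value `BETA = 1`, the sample point, and the step size `h` are arbitrary illustrative choices.

```python
import math

BETA = 1.0  # assumed field-strength constant (beta in Example 1.2)

def dipole_field(x, y, z):
    """B = beta [3xz, 3yz, 3z^2 - r^2] / r^5 from Example 1.2."""
    r = math.sqrt(x * x + y * y + z * z)
    s = BETA / r**5
    return (3 * x * z * s, 3 * y * z * s, (3 * z * z - r * r) * s)

def radial_field(x, y, z):
    """Comparison field (x, y, z), whose divergence is exactly 3."""
    return (x, y, z)

def divergence(field, x, y, z, h=1e-4):
    """Central-difference estimate of dBx/dx + dBy/dy + dBz/dz."""
    dbx = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2 * h)
    dby = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2 * h)
    dbz = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2 * h)
    return dbx + dby + dbz

div_dipole = divergence(dipole_field, 0.3, -0.7, 1.1)
div_radial = divergence(radial_field, 0.3, -0.7, 1.1)
```

The estimate `div_dipole` should vanish to within the finite-difference error, whereas the comparison field gives a divergence of 3, confirming that the helper does not trivially return zero. The check is only valid away from the origin, where the dipole field is singular.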



1.3 Faraday's Law 

Michael Faraday discovered that changing magnetic fields induce electric fields.
This distinct physical effect, called induction, can be observed when a magnet is
waved by a loop of wire. Faraday's law says that a change in magnetic flux through
a circuit loop (see Fig. 1.4) induces a voltage around the loop according to

$$\oint_C \mathbf{E}\cdot d\boldsymbol{\ell} = -\frac{\partial}{\partial t}\int_S \mathbf{B}\cdot\hat{\mathbf{n}}\,da \tag{1.14}$$

The right side describes a change in the magnetic flux through a surface and the
left side describes the voltage around the loop containing the surface.

We apply Stokes' theorem (0.12) to the left-hand side of Faraday's law and
obtain

$$\int_S (\nabla\times\mathbf{E})\cdot\hat{\mathbf{n}}\,da = -\frac{\partial}{\partial t}\int_S \mathbf{B}\cdot\hat{\mathbf{n}}\,da \qquad\text{or}\qquad \int_S\left(\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t}\right)\cdot\hat{\mathbf{n}}\,da = 0 \tag{1.15}$$

Since this equation is true regardless of what surface is chosen, it implies

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}$$

which is the differential form of Faraday's law (1.3) (three of Maxwell's equations
down; one to go).



Example 1.3

For the electric field given in Example 1.1, $\mathbf{E} = (\alpha x^2 y^3\,\hat{\mathbf{x}} + \beta z^4\,\hat{\mathbf{y}})\cos\omega t$, use Faraday's
law (1.3) to find $\mathbf{B}(x, y, z, t)$.



Michael Faraday (1791-1867, English) 
was one of the greatest experimental 
physicists in history. Born on the out- 
skirts of London, his family was not well 
off, his father being a blacksmith. The 
young Michael Faraday only had access 
to a very basic education, and so he 
was mostly self taught and never did 
acquire much skill in mathematics. As 
a teenager, he obtained a seven-year 
apprenticeship with a book binder, dur- 
ing which time he read many books, 
including books on science and electric- 
ity. Given his background, Faraday's 
entry into the scientific community was 
very gradual, from servant to assistant 
and eventually to director of the labo- 
ratory at the Royal Institution. Faraday 
is perhaps best known for his work that 
established the law of induction and 
for the discovery that magnetic fields 
can interact with light, known as the 
Faraday effect. He also made many ad- 
vances to chemistry during his career 
including figuring out how to liquify 
several gases. Faraday was a deeply re- 
ligious man, serving as a Deacon in his 
church. (Wikipedia) 




Figure 1.4 Faraday's law. (Label in figure: Magnet.)
Andre-Marie Ampere (1775-1836,
French) was born in Lyon, France. The 
young Andre-Marie was tutored in Latin 
by his father, which gave him access to 
the mathematical works of Euler and 
Bernoulli to which he was drawn at 
an early age. When Ampere reached 
young adulthood, French revolutionaries 
executed his father. In 1799, Ampere 
married Julie Carron, who died of ill- 
ness a few years later. These tragedies 
weighed heavy on Ampere through- 
out his life, especially because he was 
away from his wife during much of their 
short life together, while he worked as 
a professor of physics and chemistry in 
Bourg. After her death, Ampere was 
appointed professor of mathematics at 
the University of Lyon and then in 1809 
at the Ecole Polytechnique in Paris. Af- 
ter hearing that a current-carrying wire 
could attract a compass needle in 1820, 
Ampere quickly developed the theory of 
electromagnetism. (Wikipedia) 



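Solution:

$$\frac{\partial \mathbf{B}}{\partial t} = -\nabla \times \mathbf{E} = -\cos \omega t \begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ \alpha x^2 y^3 & \beta z^4 & 0 \end{vmatrix}$$

$$= -\cos \omega t \left[ \hat{\mathbf{x}} \frac{\partial}{\partial y}(0) - \hat{\mathbf{x}} \frac{\partial}{\partial z}(\beta z^4) - \hat{\mathbf{y}} \frac{\partial}{\partial x}(0) + \hat{\mathbf{y}} \frac{\partial}{\partial z}(\alpha x^2 y^3) + \hat{\mathbf{z}} \frac{\partial}{\partial x}(\beta z^4) - \hat{\mathbf{z}} \frac{\partial}{\partial y}(\alpha x^2 y^3) \right]$$

$$= (4 \beta z^3 \hat{\mathbf{x}} + 3 \alpha x^2 y^2 \hat{\mathbf{z}}) \cos \omega t$$

Integrating in time, we get

$$\mathbf{B} = (4 \beta z^3 \hat{\mathbf{x}} + 3 \alpha x^2 y^2 \hat{\mathbf{z}}) \frac{\sin \omega t}{\omega}$$

plus possibly a constant field.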
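The result of Example 1.3 can be spot-checked numerically. The following is a minimal sketch, not part of the original example: the values of `ALPHA`, `BETA`, and `OMEGA`, the step size, and all function names are assumptions chosen for illustration. It compares a finite-difference curl of E with a finite-difference time derivative of B at one point.

```python
import math

# Illustrative constants (assumed values, not taken from the text)
ALPHA, BETA, OMEGA = 1.3, 0.7, 2.0

def E(x, y, z, t):
    """Electric field of Example 1.1."""
    c = math.cos(OMEGA * t)
    return (ALPHA * x**2 * y**3 * c, BETA * z**4 * c, 0.0)

def B(x, y, z, t):
    """Magnetic field of Example 1.3 (integration constant set to zero)."""
    s = math.sin(OMEGA * t) / OMEGA
    return (4 * BETA * z**3 * s, 0.0, 3 * ALPHA * x**2 * y**2 * s)

def curl(F, x, y, z, t, h=1e-5):
    """Central-difference curl of a vector field F(x, y, z, t)."""
    def d(comp, axis):
        plus = [x, y, z]
        minus = [x, y, z]
        plus[axis] += h
        minus[axis] -= h
        return (F(*plus, t)[comp] - F(*minus, t)[comp]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def dB_dt(x, y, z, t, h=1e-5):
    """Central-difference time derivative of B."""
    later, earlier = B(x, y, z, t + h), B(x, y, z, t - h)
    return tuple((p - m) / (2 * h) for p, m in zip(later, earlier))

def faraday_residual(x, y, z, t):
    """Largest component of curl(E) + dB/dt; Faraday's law says it is zero."""
    c = curl(E, x, y, z, t)
    b = dB_dt(x, y, z, t)
    return max(abs(ci + bi) for ci, bi in zip(c, b))

print(faraday_residual(0.4, -0.9, 1.1, 0.3))  # small, limited by finite differences
```

Any test point and time should give a residual near zero; a constant field added to B would not change the result, in line with the closing remark of the example.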



1.4 Ampere's Law 



The Biot-Savart law (1.12) can also be used to derive Ampere's law. Ampere's law 
is merely the inversion of the Biot-Savart law (1.12) so that J appears by itself, 
unfettered by integrals or the like. 

Inversion of Biot-Savart Law 

We take the curl of (1.12): 



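$$\nabla \times \mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi} \int_V \nabla_{\mathbf{r}} \times \left[ \mathbf{J}(\mathbf{r}') \times \frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} \right] dv' \tag{1.16}$$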



We next apply the differential vector rule from P0.7 while noting that J (r') does not 
depend on r so that only two terms survive. The curl of B (r) then becomes 



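$$\nabla \times \mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi} \int_V \left( \mathbf{J}(\mathbf{r}') \left[ \nabla_{\mathbf{r}} \cdot \frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} \right] - \left[ \mathbf{J}(\mathbf{r}') \cdot \nabla_{\mathbf{r}} \right] \frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} \right) dv' \tag{1.17}$$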



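According to (1.9), the first term in the integral is $4\pi \mathbf{J}(\mathbf{r}') \delta^3(\mathbf{r}' - \mathbf{r})$, which is easily
integrated. To make progress on the second term, we observe that the gradient can
be changed to operate on the primed variables without affecting the final result
(i.e. $\nabla_{\mathbf{r}} \rightarrow -\nabla_{\mathbf{r}'}$). In addition, we take advantage of a vector integral theorem (see
P0.12) to arrive at

$$\nabla \times \mathbf{B}(\mathbf{r}) = \mu_0 \mathbf{J}(\mathbf{r}) - \frac{\mu_0}{4\pi} \int_V \frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} \left[ \nabla_{\mathbf{r}'} \cdot \mathbf{J}(\mathbf{r}') \right] dv' + \frac{\mu_0}{4\pi} \oint_S \frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} \left[ \mathbf{J}(\mathbf{r}') \cdot \hat{\mathbf{n}} \right] da' \tag{1.18}$$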
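The last term in (1.18) vanishes if we assume that the current density $\mathbf{J}$ is completely
contained within the volume $V$ so that it is zero at the surface $S$. Thus, the
expression for the curl of $\mathbf{B}(\mathbf{r})$ reduces to

$$\nabla \times \mathbf{B}(\mathbf{r}) = \mu_0 \mathbf{J}(\mathbf{r}) - \frac{\mu_0}{4\pi} \int_V \frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} \left[ \nabla_{\mathbf{r}'} \cdot \mathbf{J}(\mathbf{r}') \right] dv' \tag{1.19}$$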



The latter term in (1.19) vanishes if

$$\nabla \cdot \mathbf{J} = 0 \quad \text{(steady-state approximation)} \tag{1.20}$$

in which case we have succeeded in isolating J and obtained Ampere's law. 



Without Maxwell's correction, Ampere's law

$$\nabla \times \mathbf{B} = \mu_0 \mathbf{J} \tag{1.21}$$

only applies to quasi steady-state situations. The physical interpretation of Ampere's
law is more apparent in integral form. We integrate both sides of (1.21) over
an open surface $S$, bounded by contour $C$, and apply Stokes' theorem (0.12) to the
left-hand side:

$$\oint_C \mathbf{B}(\mathbf{r}) \cdot d\boldsymbol{\ell} = \mu_0 \int_S \mathbf{J}(\mathbf{r}) \cdot \hat{\mathbf{n}}\, da \equiv \mu_0 I \tag{1.22}$$

This law says that the line integral of $\mathbf{B}$ around a closed loop $C$ is proportional to
the total current flowing through the loop (see Fig. 1.5). The units of $\mathbf{J}$ are current
per area, so the surface integral containing $\mathbf{J}$ yields the current $I$ in units of charge
per time.
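As an illustration of (1.22), the line integral can be evaluated numerically for a field whose form is well known. This is only a sketch under stated assumptions: the field of an infinitely long straight wire, $B = \mu_0 I/(2\pi s)$ circulating around the wire, is a standard result that is not derived in this section, and the function names and numerical values are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in T*m/A (conventional value)

def b_field_wire(x, y, current):
    """(Bx, By) of an infinitely long straight wire along z through the origin."""
    s2 = x * x + y * y
    coeff = MU0 * current / (2 * math.pi * s2)
    return (-coeff * y, coeff * x)

def circulation(current, radius, center=(0.0, 0.0), n=2000):
    """Midpoint-rule estimate of the closed line integral of B . dl on a circle."""
    cx, cy = center
    dphi = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        x = cx + radius * math.cos(phi)
        y = cy + radius * math.sin(phi)
        bx, by = b_field_wire(x, y, current)
        # dl points along the tangent (-sin phi, cos phi) with length radius*dphi
        total += (-bx * math.sin(phi) + by * math.cos(phi)) * radius * dphi
    return total

enclosing = circulation(2.0, 0.5, center=(0.1, 0.2))  # loop encloses the wire
missing = circulation(2.0, 0.5, center=(2.0, 0.0))    # loop does not enclose it
print(enclosing / MU0, missing / MU0)  # enclosed current in amperes for each loop
```

The loop that surrounds the wire returns $\mu_0 I$ regardless of where it is centered, while the loop that misses the wire returns essentially zero, which is the content of Ampere's law in integral form.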

1.5 Maxwell's Adjustment to Ampere's Law

Maxwell was the first to realize that Ampere's law was incomplete as written in
(1.21) since there exist situations where $\nabla \cdot \mathbf{J} \neq 0$ (especially the case for optical
phenomena). Maxwell figured out that (1.20) should be replaced with

$$\nabla \cdot \mathbf{J} = -\frac{\partial \rho}{\partial t} \tag{1.23}$$

This is called the continuity equation for charge and current densities. Simply
stated, if there is net current flowing into a volume there ought to be charge piling
up inside. For the steady-state situation inherently considered by Ampere, the
current into and out of a volume is balanced so that $\partial \rho/\partial t = 0$.



Derivation of the Continuity Equation 

Consider a volume of space enclosed by a surface $S$ through which current is
flowing. The total current exiting the volume is

$$I = \oint_S \mathbf{J} \cdot \hat{\mathbf{n}}\, da \tag{1.24}$$




Figure 1.5 Ampere's law. 
James Clerk Maxwell (1831-1879,
Scottish) was born to a wealthy family 
in Edinburgh, Scotland. Originally, his 
name was John Clerk, but he added his 
mother's maiden name when he inher- 
ited an estate from her family. Maxwell 
was a bright and inquisitive child and 
displayed an unusual gift for mathe- 
matics at an early age. He attended 
Edinburgh University and then Trin- 
ity College at Cambridge University. 
Maxwell started his career as a professor 
at Aberdeen University, but lost his job 
a few years later during restructuring, 
at which time Maxwell took a post at 
King's College of London. Maxwell is 
best known for his fundamental contri- 
butions to electricity and magnetism 
and the kinetic theory of gases. He 
studied numerous other subjects, includ- 
ing the human perception of color and 
color-blindness, and is credited with pro- 
ducing the first color photograph. He 
originally postulated that electromag- 
netic waves propagated in a mechanical 
'luminiferous ether'. He founded the 
Cavendish laboratory at Cambridge in 
1874, which has produced 28 Nobel 
prizes to date. Maxwell, one of
Einstein's heroes, died of stomach cancer in
his forties. (Wikipedia) 



where $\hat{\mathbf{n}}$ is the outward normal to the surface. The units of this equation are those
of current, or charge per time, leaving the volume.

Since we have considered a closed surface $S$, the net current leaving the enclosed
volume $V$ must be the same as the rate at which charge within the volume vanishes:

$$I = -\frac{\partial}{\partial t} \int_V \rho\, dv \tag{1.25}$$



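Upon equating these two expressions for current, as well as applying the divergence
theorem (0.11) to the former, we get

$$\int_V \nabla \cdot \mathbf{J}\, dv = -\int_V \frac{\partial \rho}{\partial t}\, dv \quad \text{or} \quad \int_V \left( \nabla \cdot \mathbf{J} + \frac{\partial \rho}{\partial t} \right) dv = 0 \tag{1.26}$$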

Since (1.26) is true regardless of which volume V we choose, it implies (1.23). 
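A one-dimensional sketch may help make the continuity equation concrete. The setup below is an assumption made for illustration only: a Gaussian blob of charge drifting rigidly at speed `V`, so that $J = \rho v$. The code checks the differential form (1.23) at a point and the integral statement that the current leaving an interval equals the rate at which the enclosed charge decreases.

```python
import math

V = 3.0  # drift speed of the charge blob (assumed, arbitrary units)

def rho(x, t):
    """Charge density of a rigidly moving Gaussian blob."""
    return math.exp(-(x - V * t) ** 2)

def J(x, t):
    """Current density carried by the moving blob: J = rho * v."""
    return V * rho(x, t)

def continuity_residual(x, t, h=1e-5):
    """dJ/dx + d(rho)/dt, which (1.23) says should vanish."""
    dJ_dx = (J(x + h, t) - J(x - h, t)) / (2 * h)
    drho_dt = (rho(x, t + h) - rho(x, t - h)) / (2 * h)
    return dJ_dx + drho_dt

def charge_in(a, b, t, n=4000):
    """Midpoint-rule integral of rho over the interval a <= x <= b."""
    dx = (b - a) / n
    return sum(rho(a + (i + 0.5) * dx, t) for i in range(n)) * dx

def current_out(a, b, t):
    """Net current leaving the interval through its two ends."""
    return J(b, t) - J(a, t)

t, h = 0.2, 1e-4
dQ_dt = (charge_in(0.0, 1.0, t + h) - charge_in(0.0, 1.0, t - h)) / (2 * h)
print(continuity_residual(0.7, t), current_out(0.0, 1.0, t) + dQ_dt)  # both near zero
```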



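Maxwell's main contribution (aside from organizing other people's formulas⁹
and recognizing them as a complete set of coupled differential equations, a big
deal) was the injection of the continuity equation (1.23) into the derivation of
Ampere's law (1.19). This yields

$$\nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \frac{\mu_0}{4\pi} \frac{\partial}{\partial t} \int_V \rho(\mathbf{r}') \frac{(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}\, dv' \tag{1.27}$$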



Then substitution of (1.7) into this formula gives

$$\nabla \times \frac{\mathbf{B}}{\mu_0} = \mathbf{J} + \epsilon_0 \frac{\partial \mathbf{E}}{\partial t}$$

the last of Maxwell's equations (1.4).

This revised Ampere's law includes the additional term $\epsilon_0 \partial \mathbf{E}/\partial t$, which is
known as the displacement current (density). The displacement current exists
even in the absence of any actual charge density $\rho$.¹⁰ It indicates that a changing
electric field behaves like a current in the sense that it produces magnetic fields. 
The similarity between Faraday's law and the corrected Ampere's law (1.4) is 
apparent. No doubt this played a part in motivating Maxwell's work. 

In summary, in the previous section we saw that the basic physics in Ampere's 
law is present in the Biot-Savart law. Infusing it with charge conservation (1.23) 
yields the corrected form of Ampere's law. 



⁹ Although Gauss developed his law in 1835, it was not published until 1867, after his death and
well after Maxwell published his laws of electromagnetism, so in practice Maxwell accomplished
much more than merely fixing Ampere's law.

¹⁰ Based on (1.27), one might think that the displacement current $\epsilon_0 \partial \mathbf{E}/\partial t$ ought to be zero in a
region of space with no charge density $\rho$. However, in (1.27) $\rho$ appears in a volume integral over a
region of space sufficiently large (consistent with a previous supposition) to include any charges
responsible for the field $\mathbf{E}$; presumably, all fields arise from sources.
Example 1.4

(a) Use Gauss's law to find the electric field in a gap that interrupts a current-carrying
wire, as shown in Fig. 1.6.

(b) Find the strength of the magnetic field on contour $C$ using Ampere's law applied
to surface $S_1$.

(c) Show that the displacement current in the gap leads to the identical magnetic
field when using surface $S_2$.

Solution: (a) We'll assume that the cross section of the wire (of area $A$) is much wider
than the gap separation. Then the electric field in the gap will be uniform, and the
integral on the left-hand side of (1.10) reduces to $EA$ since there is essentially no
field other than in the gap. If the accumulated charge on the 'plate' is $Q$, then the
right-hand side of (1.10) integrates to $Q/\epsilon_0$, and the electric field turns out to be
$E = Q/(\epsilon_0 A)$.

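(b) Let the contour $C$ be a circle at radius $r$. The magnetic field points around the
circumference with constant strength. The left-hand side of (1.22) becomes $2\pi r B$
while the right-hand side is

$$\mu_0 \int_{S_1} \mathbf{J} \cdot \hat{\mathbf{n}}\, da = \mu_0 I = \mu_0 \frac{\partial Q}{\partial t}$$

This gives for the magnetic field

$$B = \frac{\mu_0}{2\pi r} \frac{\partial Q}{\partial t}$$

Figure 1.6 Charging capacitor.

(c) If instead we use the displacement current $\epsilon_0 \partial \mathbf{E}/\partial t$ in place of $\mathbf{J}$ in the
right-hand side of (1.22), we get for that piece

$$\mu_0 \int_{S_2} \left( \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} \right) \cdot \hat{\mathbf{n}}\, da = \mu_0 \epsilon_0 \frac{\partial E}{\partial t} A = \mu_0 \frac{\partial Q}{\partial t}$$

which is the same as before.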
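The equivalence shown in Example 1.4 can be illustrated with numbers. The sketch below is only an illustration: the charging rate, contour radius, and wire cross section are hypothetical values, and the function names are not from the text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m
MU0 = 4e-7 * math.pi     # vacuum permeability in T*m/A (conventional value)

def b_from_conduction_current(dq_dt, r):
    """Part (b): B on contour C from the current piercing surface S1."""
    return MU0 * dq_dt / (2 * math.pi * r)

def b_from_displacement_current(dq_dt, r, area):
    """Part (c): B on contour C from the displacement current through S2."""
    dE_dt = dq_dt / (EPS0 * area)               # time derivative of E = Q/(eps0 A)
    displacement_current = EPS0 * dE_dt * area  # eps0 dE/dt integrated over the gap
    return MU0 * displacement_current / (2 * math.pi * r)

# Hypothetical numbers: 2 A charging current, 5 cm contour radius, 1 cm^2 wire
b1 = b_from_conduction_current(2.0, 0.05)
b2 = b_from_displacement_current(2.0, 0.05, 1e-4)
print(b1, b2)  # the two surfaces give the same field strength
```

The area cancels in part (c), which is why the answer does not depend on the wire's cross section as long as the field in the gap is uniform.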



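Example 1.5

For the electric field $\mathbf{E} = (\alpha x^2 y^3 \hat{\mathbf{x}} + \beta z^4 \hat{\mathbf{y}}) \cos \omega t$ (see Example 1.1) and the associated
magnetic field $\mathbf{B} = (4 \beta z^3 \hat{\mathbf{x}} + 3 \alpha x^2 y^2 \hat{\mathbf{z}}) \frac{\sin \omega t}{\omega}$ (see Example 1.3), find the
current density $\mathbf{J}(x, y, z, t)$.

Solution:

$$\mathbf{J} = \nabla \times \frac{\mathbf{B}}{\mu_0} - \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} = \frac{\sin \omega t}{\mu_0 \omega} \begin{vmatrix} \hat{\mathbf{x}} & \hat{\mathbf{y}} & \hat{\mathbf{z}} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ 4 \beta z^3 & 0 & 3 \alpha x^2 y^2 \end{vmatrix} + \epsilon_0 \omega (\alpha x^2 y^3 \hat{\mathbf{x}} + \beta z^4 \hat{\mathbf{y}}) \sin \omega t$$

$$= \frac{\sin \omega t}{\mu_0 \omega} \left[ 6 \alpha x^2 y\, \hat{\mathbf{x}} - 6 \alpha x y^2\, \hat{\mathbf{y}} + 12 \beta z^2\, \hat{\mathbf{y}} \right] + \epsilon_0 \omega (\alpha x^2 y^3 \hat{\mathbf{x}} + \beta z^4 \hat{\mathbf{y}}) \sin \omega t$$

$$= \left[ \left( \epsilon_0 \omega \alpha x^2 y^3 + \frac{6 \alpha x^2 y}{\mu_0 \omega} \right) \hat{\mathbf{x}} + \left( \epsilon_0 \omega \beta z^4 + \frac{12 \beta z^2}{\mu_0 \omega} - \frac{6 \alpha x y^2}{\mu_0 \omega} \right) \hat{\mathbf{y}} \right] \sin \omega t$$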
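A numerical cross-check of Example 1.5 is sketched below. It evaluates $\mathbf{J} = \nabla \times \mathbf{B}/\mu_0 - \epsilon_0\, \partial \mathbf{E}/\partial t$ by finite differences and compares it with the closed form obtained by carrying out the derivatives by hand. Everything numerical here is an assumption for illustration: `ALPHA`, `BETA`, and `OMEGA` are arbitrary, and `MU0` and `EPS0` are deliberately scaled stand-ins (not the SI constants) so that both terms contribute comparably.

```python
import math

ALPHA, BETA, OMEGA = 1.3, 0.7, 2.0  # illustrative values (assumed)
MU0, EPS0 = 2.0, 0.5                # scaled stand-ins, NOT the SI constants

def E(x, y, z, t):
    """Electric field of Example 1.1."""
    c = math.cos(OMEGA * t)
    return (ALPHA * x**2 * y**3 * c, BETA * z**4 * c, 0.0)

def B(x, y, z, t):
    """Magnetic field of Example 1.3."""
    s = math.sin(OMEGA * t) / OMEGA
    return (4 * BETA * z**3 * s, 0.0, 3 * ALPHA * x**2 * y**2 * s)

def partial(F, comp, axis, p, h=1e-5):
    """Central difference of component comp of F along axis 0..3 (x, y, z, t)."""
    hi, lo = list(p), list(p)
    hi[axis] += h
    lo[axis] -= h
    return (F(*hi)[comp] - F(*lo)[comp]) / (2 * h)

def J_numeric(x, y, z, t):
    """J from Ampere's law with Maxwell's correction, by finite differences."""
    p = (x, y, z, t)
    curl_b = (partial(B, 2, 1, p) - partial(B, 1, 2, p),
              partial(B, 0, 2, p) - partial(B, 2, 0, p),
              partial(B, 1, 0, p) - partial(B, 0, 1, p))
    return tuple(curl_b[i] / MU0 - EPS0 * partial(E, i, 3, p) for i in range(3))

def J_closed_form(x, y, z, t):
    """J obtained by evaluating the curl and time derivative analytically."""
    s = math.sin(OMEGA * t)
    jx = (EPS0 * OMEGA * ALPHA * x**2 * y**3
          + 6 * ALPHA * x**2 * y / (MU0 * OMEGA)) * s
    jy = (EPS0 * OMEGA * BETA * z**4
          + (12 * BETA * z**2 - 6 * ALPHA * x * y**2) / (MU0 * OMEGA)) * s
    return (jx, jy, 0.0)

point = (0.4, -0.9, 1.1, 0.3)
print(J_numeric(*point))
print(J_closed_form(*point))
```

The two printed vectors should agree to within the finite-difference error, which provides an independent check on the algebra.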
1.6 Polarization of Materials

Figure 1.7 A polarized medium with $\nabla \cdot \mathbf{P} = 0$.



We are essentially finished with our analysis of Maxwell's equations except for a
brief discussion of current density $\mathbf{J}$ and charge density $\rho$. The current density
can be decomposed into three general categories. First, as you might expect,
currents can arise from free charges in motion such as electrons in a metal. We
denote this type of current as $\mathbf{J}_{\text{free}}$. Second, individual atoms can exhibit internal
currents that give rise to paramagnetic and diamagnetic effects, denoted by $\mathbf{J}_{\text{m}}$.
These are seldom important in optics problems, and so we will ignore these types
of currents. Third, molecules in a material can elongate and become dipoles in
response to an applied electric field. We denote this type of current, which arises
from the polarization of the medium, by $\mathbf{J}_{\text{p}}$.

The polarization current $\mathbf{J}_{\text{p}}$ is associated with a dipole distribution function
$\mathbf{P}$, called the polarization (in units of dipoles per volume, or charge times length
per volume). Physically, if the dipoles (depicted in Fig. 1.7) change their strength
or orientation as a function of time in some coordinated fashion, an effective
current density arises in the medium. Since the time-derivative of an individual
dipole moment renders charge multiplied by velocity, the time-derivative of a
distribution of 'sloshing' dipoles gives a current density equal to

$$\mathbf{J}_{\text{p}} = \frac{\partial \mathbf{P}}{\partial t} \tag{1.28}$$

We thus write the total current in an optical medium (ignoring magnetic effects)
as

$$\mathbf{J} = \mathbf{J}_{\text{free}} + \frac{\partial \mathbf{P}}{\partial t} \tag{1.29}$$



Now let's turn our attention to charge density $\rho$. We seldom consider the
propagation of electromagnetic waveforms through electrically charged materials.
We therefore will write $\rho_{\text{free}} = 0$. One might be tempted in this case to set the
overall charge density $\rho$ to zero, but this would be wrong. The polarization of a
neutral material, described by $\mathbf{P}$, can vary spatially, leading to local concentrations
of positive or negative charges.

We let $\rho_{\text{p}}$ denote the charge density created by variations in the polarization
$\mathbf{P}(\mathbf{r})$. To determine an expression for $\rho_{\text{p}}$, we write the continuity equation (1.23)
as applied to the currents and charges associated with this polarization:

$$\nabla \cdot \mathbf{J}_{\text{p}} = -\frac{\partial \rho_{\text{p}}}{\partial t} \tag{1.30}$$

Substitution of (1.28) into this equation immediately yields

$$\rho_{\text{p}} = -\nabla \cdot \mathbf{P} \tag{1.31}$$



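To better appreciate local charge buildup due to variation in the medium
polarization, consider the divergence theorem (0.11) applied to $\mathbf{P}(\mathbf{r})$:

$$-\oint_S \mathbf{P}(\mathbf{r}) \cdot \hat{\mathbf{n}}\, da = -\int_V \nabla \cdot \mathbf{P}(\mathbf{r})\, dv \tag{1.32}$$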
The left-hand side of (1.32) is a surface integral, which after integrating gives
units of charge. Physically, it is the sum of the charges touching the inside of
surface $S$ (multiplied by a minus since by convention dipole vectors point from
the negatively charged end of a molecule to the positively charged end). When
$\nabla \cdot \mathbf{P}$ is zero, there are equal numbers of positive and negative charges touching
$S$ from within, as depicted in Fig. 1.7. When $\nabla \cdot \mathbf{P}$ is not zero, the positive and
negative charges touching $S$ are not balanced, as depicted in Fig. 1.8. Essentially,
excess charge ends up within the volume because the non-uniform alignment of
dipoles causes them to be cut preferentially at the surface.¹¹

Since we will ignore free charges (for optical media), we write the charge
density according to (1.31) as

$$\rho = -\nabla \cdot \mathbf{P} \tag{1.33}$$
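A one-dimensional sketch can illustrate how a spatially varying polarization produces local charge in a neutral medium. The Gaussian profile, its parameters `P0` and `W`, and the function names are assumptions for illustration; in one dimension (1.33) reduces to $\rho = -dP/dx$ and the divergence theorem (1.32) reduces to a difference of endpoint values.

```python
import math

P0, W = 1.0, 0.5  # peak polarization and width (assumed, arbitrary units)

def polarization(x):
    """x-component of a hypothetical polarization profile P(x)."""
    return P0 * math.exp(-(x / W) ** 2)

def bound_charge_density(x, h=1e-6):
    """rho = -dP/dx, the one-dimensional form of (1.33), by central difference."""
    return -(polarization(x + h) - polarization(x - h)) / (2 * h)

def bound_charge(a, b, n=2000):
    """Integrate rho over a <= x <= b with the midpoint rule."""
    dx = (b - a) / n
    return sum(bound_charge_density(a + (i + 0.5) * dx) for i in range(n)) * dx

print(bound_charge(0.0, 1.0), polarization(0.0) - polarization(1.0))  # these agree
print(bound_charge(-5.0, 5.0))  # essentially zero: the medium is neutral overall
```

Positive charge accumulates where the polarization decreases along its own direction and negative charge where it increases, while the total over a region whose boundary lies outside the polarized material vanishes.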

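In summary, in electrically neutral non-magnetic media, Maxwell's equations
(in terms of the medium polarization $\mathbf{P}$) are¹²

$$\nabla \cdot \mathbf{E} = -\frac{\nabla \cdot \mathbf{P}}{\epsilon_0} \quad \text{(Gauss's law)} \tag{1.34}$$

$$\nabla \cdot \mathbf{B} = 0 \quad \text{(Gauss's law for magnetism)} \tag{1.35}$$

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \quad \text{(Faraday's law)} \tag{1.36}$$

$$\nabla \times \frac{\mathbf{B}}{\mu_0} = \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} + \frac{\partial \mathbf{P}}{\partial t} + \mathbf{J}_{\text{free}} \quad \text{(Ampere's law; fixed by Maxwell)} \tag{1.37}$$

Figure 1.8 A polarized medium with $\nabla \cdot \mathbf{P} \neq 0$.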



1.7 The Wave Equation

When Maxwell unified electromagnetic theory, he immediately noticed that waves
are solutions to this set of equations. In fact, his desire to find a set of equations
that allowed for waves aided his effort to find the correct equations. After all, it
was already known that light traveled as waves. Kirchhoff had previously noticed
that $1/\sqrt{\epsilon_0 \mu_0}$ gives the correct speed of light $c = 3 \times 10^8$ m/s (which had previously
been measured). Faraday and Kerr had observed that strong magnetic and electric
fields affect light propagating in crystals. The time was right to suspect that light
was an electromagnetic phenomenon at high frequency.

¹¹ The figures may give you the impression that you could always just draw a surface that avoids
cutting any dipoles. However, the function $\mathbf{P}(\mathbf{r})$ is continuous, while the figures depict crudely just
a few dipoles. In a continuous material you can't draw a surface that avoids cutting dipoles.

¹² It is not uncommon to see the macroscopic Maxwell equations written in terms of two auxiliary
fields: $\mathbf{H}$ and $\mathbf{D}$. The field $\mathbf{H}$ is useful in magnetic materials. In these materials, the combination
$\mathbf{B}/\mu_0$ in Ampere's law is replaced by $\mathbf{H} \equiv \mathbf{B}/\mu_0 - \mathbf{M}$, where $\mathbf{J}_{\text{m}} = \nabla \times \mathbf{M}$ is the current associated
with the material's magnetization. Since we only consider nonmagnetic materials ($\mathbf{M} = 0$), there
is little point in using $\mathbf{H}$. The field $\mathbf{D}$, called the displacement, is defined as $\mathbf{D} \equiv \epsilon_0 \mathbf{E} + \mathbf{P}$. This
combination of $\mathbf{E}$ and $\mathbf{P}$ occurs in Coulomb's law and Ampere's law. For the purposes of this book,
it is conceptually more clear to retain the polarization $\mathbf{P}$ as a separate field in these two equations.
At first glance, Maxwell's equations might not immediately suggest (to the
inexperienced eye) that waves are solutions. However, we can manipulate the
equations (first order differential equations that couple $\mathbf{E}$ to $\mathbf{B}$) into the familiar
wave equation (decoupled second order differential equations for either $\mathbf{E}$ or $\mathbf{B}$).
You should become familiar with this derivation. In what follows, we will derive
the wave equation for $\mathbf{E}$. The derivation of the wave equation for $\mathbf{B}$ is very similar
(see problem P1.6).

Derivation of the Wave Equation 

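Taking the curl of (1.3) gives

$$\nabla \times (\nabla \times \mathbf{E}) + \frac{\partial}{\partial t} (\nabla \times \mathbf{B}) = 0 \tag{1.38}$$

We may eliminate $\nabla \times \mathbf{B}$ by substitution from (1.4), which gives

$$\nabla \times (\nabla \times \mathbf{E}) + \mu_0 \epsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} = -\mu_0 \frac{\partial \mathbf{J}}{\partial t} \tag{1.39}$$

Next we apply the differential vector identity (0.10), $\nabla \times (\nabla \times \mathbf{E}) = \nabla (\nabla \cdot \mathbf{E}) - \nabla^2 \mathbf{E}$,
and use Gauss' law (1.1) to replace the term $\nabla \cdot \mathbf{E}$, which brings us to

$$\nabla^2 \mathbf{E} - \mu_0 \epsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} = \mu_0 \frac{\partial \mathbf{J}}{\partial t} + \frac{\nabla \rho}{\epsilon_0} \tag{1.40}$$

Substitution from (1.29) and (1.33) gives the more-useful-for-optics form

$$\nabla^2 \mathbf{E} - \mu_0 \epsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} = \mu_0 \frac{\partial \mathbf{J}_{\text{free}}}{\partial t} + \mu_0 \frac{\partial^2 \mathbf{P}}{\partial t^2} - \frac{1}{\epsilon_0} \nabla (\nabla \cdot \mathbf{P}) \tag{1.41}$$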

The left-hand side of (1.41) is the familiar wave equation. However, the right-hand
side contains a number of source terms, which arise when various currents
and/or polarizations are present. The first term on the right-hand side of (1.41)
describes currents of free charges, which are important for determining the reflection
of light from a metallic surface or for determining the propagation of light in
a plasma. The second term on the right-hand side describes dipole oscillations,
which behave similarly to currents. The final term on the right-hand side of (1.41)
is important in anisotropic media such as crystals. In this case, the polarization
$\mathbf{P}$ responds to the electric field along a direction not necessarily parallel to $\mathbf{E}$,
due to the influence of the crystal lattice (addressed in chapter 5). In summary,
when light propagates in a material, at least one of the terms on the right-hand
side of (1.41) will be nonzero. As an example, in glass, $\mathbf{J}_{\text{free}} = 0$ and $\nabla \cdot \mathbf{P} = 0$, but
$\partial^2 \mathbf{P}/\partial t^2 \neq 0$ since the medium polarization responds to the light field, giving rise
to refractive index (discussed in chapter 2).

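Example 1.6

Show that the electric field

$$\mathbf{E} = (\alpha x^2 y^3 \hat{\mathbf{x}} + \beta z^4 \hat{\mathbf{y}}) \cos \omega t$$

and the associated charge density (see Example 1.1)

$$\rho = 2 \epsilon_0 \alpha x y^3 \cos \omega t$$

together with the associated current density (see Example 1.5)

$$\mathbf{J} = \left[ \left( \epsilon_0 \omega \alpha x^2 y^3 + \frac{6 \alpha x^2 y}{\mu_0 \omega} \right) \hat{\mathbf{x}} + \left( \epsilon_0 \omega \beta z^4 + \frac{12 \beta z^2}{\mu_0 \omega} - \frac{6 \alpha x y^2}{\mu_0 \omega} \right) \hat{\mathbf{y}} \right] \sin \omega t$$

satisfy the wave equation (1.40).

Solution: We have

$$\nabla^2 \mathbf{E} - \mu_0 \epsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} = \left[ \alpha (2 y^3 + 6 x^2 y) \hat{\mathbf{x}} + 12 \beta z^2 \hat{\mathbf{y}} \right] \cos \omega t + \mu_0 \epsilon_0 \omega^2 (\alpha x^2 y^3 \hat{\mathbf{x}} + \beta z^4 \hat{\mathbf{y}}) \cos \omega t$$

$$= \left[ \alpha (2 y^3 + 6 x^2 y + \mu_0 \epsilon_0 \omega^2 x^2 y^3) \hat{\mathbf{x}} + \beta (12 z^2 + \mu_0 \epsilon_0 \omega^2 z^4) \hat{\mathbf{y}} \right] \cos \omega t$$

Similarly,

$$\mu_0 \frac{\partial \mathbf{J}}{\partial t} + \frac{\nabla \rho}{\epsilon_0} = \left[ (\mu_0 \epsilon_0 \omega^2 \alpha x^2 y^3 + 6 \alpha x^2 y) \hat{\mathbf{x}} + (\mu_0 \epsilon_0 \omega^2 \beta z^4 + 12 \beta z^2 - 6 \alpha x y^2) \hat{\mathbf{y}} \right] \cos \omega t + \left[ 2 \alpha y^3 \hat{\mathbf{x}} + 6 \alpha x y^2 \hat{\mathbf{y}} \right] \cos \omega t$$

$$= \left[ \alpha (\mu_0 \epsilon_0 \omega^2 x^2 y^3 + 6 x^2 y + 2 y^3) \hat{\mathbf{x}} + \beta (\mu_0 \epsilon_0 \omega^2 z^4 + 12 z^2) \hat{\mathbf{y}} \right] \cos \omega t$$

The two expressions are equivalent, and the wave equation is satisfied.¹³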



The magnetic field $\mathbf{B}$ satisfies a similar wave equation, decoupled from $\mathbf{E}$ (see
P1.6). However, the two waves are not independent. The fields for $\mathbf{E}$ and $\mathbf{B}$ must
be chosen to be consistent with each other through Maxwell's equations. After 
solving the wave equation (1.41) for E, one can obtain the consistent B from E via 
Faraday's law (1.36). 

In vacuum all of the terms on the right-hand side in (1.41) are zero, in which
case the wave equation reduces to

$$\nabla^2 \mathbf{E} - \mu_0 \epsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} = 0 \quad \text{(vacuum)} \tag{1.42}$$

Solutions to this equation can take on every imaginable functional shape (specified
at a given instant; the evolution thereafter is controlled by (1.42)). Moreover,
since the differential equation is linear, any number of solutions can be
added together to create other valid solutions. Consider the subclass of solutions

¹³ The expressions in Example 1.6 hardly look like waves. The (quite unlikely) current and charge
distributions, which fill all space, would have to be artificially induced rather than arise naturally in
response to a field disturbance on a medium.
that propagate in a particular direction. These waveforms preserve shape while
traveling with speed

$$c \equiv 1/\sqrt{\epsilon_0 \mu_0} = 2.9979 \times 10^8\ \text{m/s} \tag{1.43}$$

In this case, $\mathbf{E}$ depends on the argument $\hat{\mathbf{u}} \cdot \mathbf{r} - ct$, where $\hat{\mathbf{u}}$ is a unit vector specifying
the direction of propagation. The shape is preserved since features occurring at a
given position recur 'downstream' at a distance $ct$ after a time $t$. By checking this
solution in (1.42), one confirms that the speed of propagation is $c$ (see P1.8). As
mentioned previously, one may add together any combination of solutions (even
with differing directions of propagation) to form other valid solutions.
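The shape-preserving property can be illustrated numerically for propagation along a single axis. The sketch below is an illustration under assumptions: the Gaussian pulse shape, the choice of the z axis for the propagation direction, and the step sizes are all hypothetical. It checks that a function of $z - ct$ satisfies the one-dimensional form of (1.42), using $\mu_0 \epsilon_0 = 1/c^2$, and that a feature recurs a distance $cT$ downstream after a time $T$.

```python
import math

C = 2.9979e8  # speed of light in m/s, as quoted in (1.43)

def pulse(s):
    """Arbitrary waveform shape; here a Gaussian about 1 m wide (assumed)."""
    return math.exp(-s * s)

def field(z, t):
    """Waveform traveling in the +z direction: depends only on z - c t."""
    return pulse(z - C * t)

def wave_equation_residual(z, t, dz=1e-3, dt=2e-12):
    """d2E/dz2 - (1/c^2) d2E/dt2 by central differences; should be near zero."""
    d2_dz2 = (field(z + dz, t) - 2 * field(z, t) + field(z - dz, t)) / dz**2
    d2_dt2 = (field(z, t + dt) - 2 * field(z, t) + field(z, t - dt)) / dt**2
    return d2_dz2 - d2_dt2 / C**2

T = 1e-9  # one nanosecond later, the same feature sits c*T (about 0.3 m) downstream
print(field(0.3, 0.0), field(0.3 + C * T, T))
print(wave_equation_residual(0.3, 1e-9))
```

Replacing `pulse` with any other smooth shape should leave both checks intact, which is the sense in which the functional form is arbitrary.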
Exercises for 1.1 Gauss' Law

P1.1 Consider an infinitely long hollow cylinder with inner radius $a$ and
outer radius $b$ as shown in Fig. 1.9. Assume that the cylinder has a
charge density $\rho = k/s^2$ for $a < s < b$ and no charge elsewhere, where $s$
is the radial distance from the axis of the cylinder. Use Gauss's Law in
integral form to find the electric field produced by this charge for each
of the three regions: $s < a$, $a < s < b$, and $s > b$.

HINT: For each region first draw an appropriate 'Gaussian surface' and 
integrate the charge density over the volume to figure out the enclosed 
charge. Then use Gauss's law in integral form and the symmetry of the 
problem to solve for the electric field. 

Exercises for 1.3 Faraday's Law 

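P1.2 Suppose that an electric field is given by $\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0 \cos(\mathbf{k} \cdot \mathbf{r} - \omega t + \phi)$,
where $\mathbf{k} \perp \mathbf{E}_0$ and $\phi$ is a constant phase. Show that

$$\mathbf{B}(\mathbf{r}, t) = \frac{\mathbf{k} \times \mathbf{E}_0}{\omega} \cos(\mathbf{k} \cdot \mathbf{r} - \omega t + \phi)$$

is consistent with (1.3).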




Figure 1.9 A charged cylinder with 
charge located between a and b. 



Exercises for 1.4 Ampere's Law 

P1.3 A conducting cylinder with the same geometry as P1.1 carries a current
density $\mathbf{J} = (k/s)\, \hat{\mathbf{z}}$ along the axis of the cylinder for $a < s < b$, where $s$ is
the radial distance from the axis of the cylinder. Using Ampere's Law in
integral form, find the magnetic field due to this current. Find the field
for each of the three regions: $s < a$, $a < s < b$, and $s > b$.

HINT: For each region first draw an appropriate 'Amperian loop' and
integrate the current density over the surface to figure out how much
current passes through the loop. Then use Ampere's law in integral
form and the symmetry of the problem to solve for the magnetic field.



Exercises for 1.6 Polarization of Materials 

P1.4 Memorize Maxwell's equations (1.1)-(1.4) together with (1.29) and 
(1.33). Be prepared to reproduce them from memory on an exam, 
and write them on your homework from memory to indicate completion.
Also very briefly summarize the physical principles described by
each of Maxwell's equations, and the assumptions that go into writing
(1.29) and (1.33).

P1.5 Check that the $\mathbf{E}$ and $\mathbf{B}$ fields in P1.2 satisfy the rest of Maxwell's equations
(1.1), (1.2), and (1.4). What are the implications for $\mathbf{J}$ and $\rho$?

Exercises for 1.7 The Wave Equation 

P1.6 Derive the wave equation for the magnetic field $\mathbf{B}$ in vacuum (i.e. $\mathbf{J} = 0$
and $\rho = 0$).

P1.7 Show that the magnetic field in P1.2 is consistent with the wave equation
derived in P1.6.

P1.8 Verify that $\mathbf{E}(\hat{\mathbf{u}} \cdot \mathbf{r} - ct)$ satisfies the vacuum wave equation (1.42), where
$\mathbf{E}$ has an arbitrary functional form.

P1.9 (a) Show that $\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0 \cos[k(\hat{\mathbf{u}} \cdot \mathbf{r} - ct) + \phi]$ is a solution to the
vacuum wave equation (1.42), where $\hat{\mathbf{u}}$ is an arbitrary unit vector and $k$ is
a constant with units of inverse length.

(b) Show that each wave front forms a plane, which is why such solutions
are often called 'plane waves'. HINT: A wavefront is a surface in
space where the argument of the cosine (i.e. the phase of the wave) has
a constant value. Set the cosine argument to an arbitrary constant and
see what positions are associated with that phase.

(c) Determine the speed $v = \Delta r/\Delta t$ that a wave front moves in the $\hat{\mathbf{u}}$
direction. HINT: Set the cosine argument to a constant, solve for $r$, and
differentiate $r$ with respect to $t$.

(d) By analysis, determine the wavelength $\lambda$ in terms of $k$. HINT: Find
the distance between identical wave fronts by changing the cosine
argument by $2\pi$ at a given instant in time.

(e) Use (1.34) to show that $\mathbf{E}_0$ and $\hat{\mathbf{u}}$ must be perpendicular to each
other in vacuum.

L1.10 Measure the speed of light using a rotating mirror. Provide an estimate
of the experimental uncertainty in your answer (not the percentage
error from the known value). (video)

Figure 1.10 shows a simplified geometry for the optical path for light
in this experiment. Laser light from A reflects from a rotating mirror
at B towards C. The light returns to B, where the mirror has rotated,
sending the light to point D. Notice that a mirror rotation of $\theta$ deflects
the beam by $2\theta$.

Figure 1.10 Geometry for lab 1.10.

Figure 1.11 A schematic of the setup for lab 1.10.



P1.11 Ole Roemer made the first successful measurement of the speed of light
in 1676 by observing the orbital period of Io, a moon of Jupiter with a
period of 42.5 hours. When Earth is moving toward Jupiter, the period
is measured to be shorter than 42.5 hours because light indicating the
end of the moon's orbit travels less distance than light indicating the
beginning. When Earth is moving away from Jupiter, the situation is
reversed, and the period is measured to be longer than 42.5 hours.

(a) If you were to measure the time for 40 observed orbits of Io when
Earth is moving directly toward Jupiter and then several months later
measure the time for 40 observed orbits when Earth is moving directly
away from Jupiter, what would you expect the difference between these
two measurements to be? Take the Earth's orbital radius to be $1.5 \times 10^{11}$ m.
To simplify the geometry, just assume that Earth moves directly toward
or away from Jupiter over the entire 40 orbits (see Fig. 1.12).

(b) Roemer did the experiment described in part (a), and experimen- 
tally measured a 22 minute difference. What speed of light would one 
deduce from that value? 

P1.12 In an isotropic medium (i.e. $\nabla \cdot \mathbf{P} = 0$), the polarization can often be
written as a function of the electric field: $\mathbf{P} = \epsilon_0 \chi(E) \mathbf{E}$, where $\chi(E) \equiv
\chi_1 + \chi_2 E + \chi_3 E^2 + \cdots$. The higher order coefficients in the expansion (i.e.
$\chi_2, \chi_3, \ldots$) are typically small, so only the first term is important at low
intensities. The field of nonlinear optics deals with intense light-matter
interactions, where the higher order terms of the expansion become
important. This can lead to phenomena such as harmonic generation.

Starting with Maxwell's equations, derive the wave equation for nonlinear
optics in an isotropic medium:

$$\nabla^2 \mathbf{E} - \mu_0 \epsilon_0 (1 + \chi_1) \frac{\partial^2 \mathbf{E}}{\partial t^2} = \mu_0 \epsilon_0 \frac{\partial^2 \left[ (\chi_2 E + \chi_3 E^2 + \cdots) \mathbf{E} \right]}{\partial t^2} + \mu_0 \frac{\partial \mathbf{J}}{\partial t}$$

We retain the possibility of current here since, for example, in a gas 
some of the molecules might ionize in the presence of a strong field, 
giving rise to currents. 



Ole Roemer (1644-1710, Danish) was 
a man of many interests. In addition to 
measuring the speed of light, he created 
a temperature scale which with slight 
modification became the Fahrenheit 
scale, introduced a system of standard 
weights and measures, and was heavily 
involved in civic affairs (city planning, 
etc.). Scientists initially became interested
in Io's orbit because its eclipse
(when it went behind Jupiter) was an 
event that could be seen from many 
places on earth. By comparing accurate 
measurements of the local time when Io 
was eclipsed by Jupiter at two remote 
places on earth, scientists in the 1600s 
were able to determine the longitude 
difference between the two places. 



Figure 1.12 Geometry for P1.11. (Labels in figure: Earth, Sun, Io, Jupiter.)



Chapter 2 

Plane Waves and Refractive Index 



Now we turn our focus to sinusoidal solutions of Maxwell's equations, called 
plane waves. Restricting our attention to plane waves may seem limiting at first, 
since (as mentioned in chapter 1) any waveform shape that travels in a particular 
direction can satisfy the wave equation in vacuum, as long as it moves at the 
speed c and has the requisite connections between E and B. It turns out, however, 
that an arbitrary waveform can always be constructed from a linear superposition 
of sinusoidal waves. Thus, there is no loss of generality if we focus our attention 
on plane-wave solutions. 

In a material, the electric field of a plane wave induces oscillating dipoles, 
and these oscillating dipoles in turn alter the electric field. We use the index of 
refraction to describe this effect. Plane waves of different frequencies experience 
different refractive indices, which causes them to travel at different speeds in 
materials. Thus, an arbitrary waveform, which is composed of multiple sinusoidal 
waves, invariably changes shape as it travels in a material, as the different sinu- 
soidal waves change relationship with respect to one another. This dispersion 
phenomenon is a primary reason why physicists and engineers choose to work 
with sinusoidal waves. Every waveform except for individual sinusoidal waves 
changes shape as it travels in a material. 

When describing plane waves, it is convenient to employ complex numbers 
to represent physical quantities. This is particularly true for problems involving 
absorption, which takes place in metals and, to a lesser degree (usually), in di- 
electrics (e.g. glass). When the electric field is represented using complex notation, 
the index of refraction also becomes a complex number. You should make sure 
you are comfortable with the material in section 0.2 before proceeding. 



2.1 Plane Wave Solutions to the Wave Equation

Consider the wave equation for an electric field waveform propagating in vacuum
(1.42):

$$\nabla^2 \mathbf{E} - \mu_0 \epsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} = 0 \tag{2.1}$$
We are interested in solutions to (2.1) that have the functional form (see P1.9)

$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0 \cos(\mathbf{k} \cdot \mathbf{r} - \omega t + \phi) \tag{2.2}$$

Figure 2.1 The electromagnetic spectrum

Here $\phi$ represents an arbitrary (constant) phase term. The vector $\mathbf{k}$, called the
wave vector, may be written as

$$\mathbf{k} \equiv k \hat{\mathbf{u}} = \frac{2\pi}{\lambda_{\text{vac}}} \hat{\mathbf{u}} \quad \text{(vacuum)} \tag{2.3}$$

where $k$ has units of inverse length, $\hat{\mathbf{u}}$ is a unit vector defining the direction of
propagation, and $\lambda_{\text{vac}}$ is the length by which $\mathbf{r}$ must vary (in the direction of $\hat{\mathbf{u}}$) to
cause the cosine to go through a complete cycle. This distance is known as the
(vacuum) wavelength. The frequency of oscillation is related to the wavelength via



$$\omega = \frac{2\pi c}{\lambda_{\text{vac}}} \quad \text{(vacuum)} \tag{2.4}$$

The frequency $\omega$ has units of radians per second. Frequency is also often expressed
as $\nu = \omega/2\pi$ in units of inverse seconds or Hz. Notice that $k$ and $\omega$ cannot
be chosen independently; the wave equation requires them to be related through
the dispersion relation

$$k = \frac{\omega}{c} \quad \text{(vacuum)} \tag{2.5}$$

Typical values for $\lambda_{\text{vac}}$ are given in Fig. 2.1. Sometimes the spatial frequency of the
wave is expressed as $1/\lambda_{\text{vac}}$, in units of cm$^{-1}$, called the wave number.
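The relations (2.3) through (2.5) are easy to tabulate for a given wavelength. The following sketch is illustrative only: the function name, the returned dictionary keys, and the 500 nm example wavelength are assumptions, not values from the text.

```python
import math

C = 2.9979e8  # speed of light in m/s, from (1.43)

def plane_wave_parameters(wavelength_vac):
    """Return k, omega, nu, and wave number for a vacuum wavelength in meters."""
    k = 2 * math.pi / wavelength_vac             # (2.3), in rad/m
    omega = 2 * math.pi * C / wavelength_vac     # (2.4), in rad/s
    nu = omega / (2 * math.pi)                   # frequency in Hz
    wave_number_cm = 1 / (wavelength_vac * 100)  # 1/lambda, in cm^-1
    return {"k": k, "omega": omega, "nu": nu, "wave_number_cm": wave_number_cm}

params = plane_wave_parameters(500e-9)  # green light, chosen as an example
print(params)
print(params["omega"] / params["k"])    # recovers c, consistent with (2.5)
```

Because `k` and `omega` are both computed from the same wavelength, their ratio reproduces the dispersion relation automatically; choosing them independently would violate (2.5).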

A magnetic wave accompanies any electric wave, and it obeys a similar wave 
equation (see PI. 6). The magnetic wave corresponding to (2.2) is 



B(r, t) = B cos(k-r-wf + </>), 



(2.6) 



It is important to note that $\mathbf{B}_0$, $\mathbf{k}$, $\omega$, and $\phi$ are not independently chosen in (2.6).
In order to satisfy Faraday's law (1.3), the arguments of the cosine in (2.2) and
(2.6) must be identical. Therefore, in vacuum the electric and magnetic fields
travel in phase. In addition, Faraday's law requires (see P1.2)

$$\mathbf{B}_0 = \frac{\mathbf{k} \times \mathbf{E}_0}{\omega} \qquad (2.7)$$

The above cross product means that $\mathbf{B}_0$ is perpendicular to both $\mathbf{E}_0$ and $\mathbf{k}$. Meanwhile,
Gauss' law $\nabla \cdot \mathbf{E} = 0$ forces $\mathbf{k}$ to be perpendicular to $\mathbf{E}_0$. It follows that the
magnitudes of the fields are related through $B_0 = kE_0/\omega$ or $B_0 = E_0/c$, in view of
(2.5).
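As a quick numerical illustration of (2.7), the sketch below builds $\mathbf{B}_0$ from an assumed $\mathbf{k}$ and $\mathbf{E}_0$ and checks the perpendicularity and magnitude statements. The helper names, the 633 nm wavelength, and the 100 V/m amplitude are hypothetical choices made only for this example.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def magnetic_amplitude(k_vec, E0_vec, omega):
    """B0 = k x E0 / omega, equation (2.7)."""
    return tuple(component / omega for component in cross(k_vec, E0_vec))

lambda_vac = 633e-9                 # assumed wavelength (m)
k_mag = 2 * math.pi / lambda_vac    # (2.3)
omega = k_mag * C                   # (2.5)
k_vec = (0.0, 0.0, k_mag)           # propagation along z
E0_vec = (100.0, 0.0, 0.0)          # assumed 100 V/m field along x

B0_vec = magnetic_amplitude(k_vec, E0_vec, omega)
print("B0 (T):", B0_vec)
print("|B0| * c (V/m):", math.sqrt(dot(B0_vec, B0_vec)) * C)
```

With $\mathbf{k}$ along $z$ and $\mathbf{E}_0$ along $x$, the magnetic amplitude comes out along $y$ with magnitude $E_0/c$.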

The influence of the magnetic field only becomes important (in comparison
to the electric field) for charged particles moving near the speed of light. This
typically takes place only for extremely intense lasers ($> 10^{18}$ W/cm$^2$, see P2.12)
where the electric field is sufficiently strong to cause electrons to oscillate with
velocities near the speed of light. We will be interested in optics problems that take
place at far lower intensity, where the effects of the magnetic field can typically be
safely ignored. Throughout the remainder of this book, we will focus our attention
mainly on the electric field with the understanding that we can at any time deduce
the (less important) magnetic field from the electric field via Faraday's law.

Figure 2.2 depicts the electric field (2.2) and the associated magnetic field 
(2.6) like transverse waves on a string. However, they are actually large planar 
sheets of uniform field strengths (difficult to draw) that move in the direction of k. 
The name plane wave is given since a constant argument in (2.2) at any moment 
describes a plane, which is perpendicular to k. A plane wave fills all space and 
may be thought of as a series of infinite sheets, each with a different uniform field 
strength, moving in the k direction. 

At this point, we rewrite our plane wave solution using complex number nota- 
tion. Although this change in notation will not make the task at hand any easier 
(and may even appear to complicate things), we introduce it here in preparation 
for later sections, where it will save considerable labor. (For a review of complex 
notation, see section 0.2.) 

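Using complex notation we rewrite (2.2) as

$$\mathbf{E}(\mathbf{r}, t) = \mathrm{Re}\left\{ \tilde{\mathbf{E}}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \right\} \qquad (2.8)$$

where we have hidden the phase term $\phi$ inside of $\tilde{\mathbf{E}}_0$ as follows:$^1$

$$\tilde{\mathbf{E}}_0 \equiv \mathbf{E}_0 e^{i\phi} \qquad (2.9)$$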

The next step we take is to become intentionally sloppy. Physicists throughout 
the world have conspired to avoid writing Re { } in an effort (or lack thereof if 
you prefer) to make expressions less cluttered. Nevertheless, only the real part of 
the field is physically relevant even though expressions and calculations contain 
both real and imaginary terms. This sloppy notation is okay since the real and 
imaginary parts of complex numbers never intermingle when adding, subtracting, 
differentiating, or integrating. We can delay taking the real part of the expression 
until the end of the calculation. Also, when hiding a phase $\phi$ inside of the field
amplitude as in (2.8), we drop the tilde (might as well since we are already being
sloppy); we will automatically assume that the field amplitude is complex and
contains phase information. Putting this all together, our plane wave solution in
complex notation is written simply as

$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \qquad (2.10)$$

It is possible to construct any electromagnetic disturbance from a linear superpo- 
sition of such waves, which we will do in chapter 7. 
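The claim that real and imaginary parts never intermingle under addition can be checked with a few lines of code. This is a minimal sketch with assumed amplitudes and phases; the function names are illustrative. It also shows that the shortcut does not carry over to multiplication, where the real parts must be taken first.

```python
import cmath
import math

def real_wave(amplitude, phi, phase):
    """One component of the real form (2.2): amplitude * cos(phase + phi)."""
    return amplitude * math.cos(phase + phi)

def complex_wave(amplitude, phi, phase):
    """Complex form (2.10), with phi folded into the amplitude as in (2.9)."""
    return amplitude * cmath.exp(1j * phi) * cmath.exp(1j * phase)

phase = 0.7                          # sample value of k.r - omega*t (rad)
waves = [(1.0, 0.3), (0.5, -1.2)]    # assumed (amplitude, phi) pairs

# Adding: taking the real part at the end gives the same answer.
sum_of_reals = sum(real_wave(a, p, phase) for a, p in waves)
real_of_sum = sum(complex_wave(a, p, phase) for a, p in waves).real
print(sum_of_reals, real_of_sum)

# Multiplying: the two orders of operation disagree.
w1 = complex_wave(*waves[0], phase)
w2 = complex_wave(*waves[1], phase)
product_of_reals = w1.real * w2.real
real_of_product = (w1 * w2).real
print(product_of_reals, real_of_product)
```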



1 We have assumed that each vector component of the field propagates with the same phase. To
be more general, one could write $\tilde{\mathbf{E}}_0 = \hat{\mathbf{x}} E_{0x} e^{i\phi_x} + \hat{\mathbf{y}} E_{0y} e^{i\phi_y} + \hat{\mathbf{z}} E_{0z} e^{i\phi_z}$.




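Figure 2.2 Depiction of electric
and magnetic fields associated
with a plane wave.

Example 2.1

Verify that the complex plane wave (2.10) is a solution to the wave equation (2.1).

Solution: The first term gives

$$\nabla^2 \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} = \mathbf{E}_0 \left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \right) e^{i(k_x x + k_y y + k_z z - \omega t)} = -\mathbf{E}_0 \left( k_x^2 + k_y^2 + k_z^2 \right) e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} = -k^2 \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \qquad (2.11)$$

and the second term (with $\mu_0 \epsilon_0 = 1/c^2$) gives

$$\frac{1}{c^2} \frac{\partial^2}{\partial t^2} \left( \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \right) = -\frac{\omega^2}{c^2} \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \qquad (2.12)$$

Upon insertion into (2.1) we obtain the vacuum dispersion relation (2.5), which
specifies the connection between the wavenumber $k$ and the frequency $\omega$, emphasizing
that $k$ and $\omega$ cannot be chosen independently.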
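The result of Example 2.1 can also be checked numerically. The sketch below is an illustration rather than part of the original example: it restricts the wave to propagation along $z$, approximates the second derivatives with central finite differences, and uses an assumed frequency. The function name is hypothetical.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def wave_equation_residual(k, omega, z=1.0e-7, t=1.0e-16):
    """Return |d2E/dz2 - (1/c^2) d2E/dt2| / k^2 for E = exp(i(kz - omega t)).

    Derivatives are approximated with central finite differences, so the
    result is only approximately zero even for an exact solution.
    """
    def field(z_, t_):
        return cmath.exp(1j * (k * z_ - omega * t_))

    dz = 1e-3 * 2 * math.pi / k       # small fraction of a wavelength
    dt = 1e-3 * 2 * math.pi / omega   # small fraction of a period
    d2_dz2 = (field(z + dz, t) - 2 * field(z, t) + field(z - dz, t)) / dz**2
    d2_dt2 = (field(z, t + dt) - 2 * field(z, t) + field(z, t - dt)) / dt**2
    return abs(d2_dz2 - d2_dt2 / C**2) / k**2

omega = 2 * math.pi * 5e14   # assumed optical frequency (rad/s)
k_good = omega / C           # satisfies the dispersion relation (2.5)
k_bad = 1.5 * omega / C      # violates (2.5)

print("residual with k = omega/c:    ", wave_equation_residual(k_good, omega))
print("residual with k = 1.5 omega/c:", wave_equation_residual(k_bad, omega))
```

The residual is negligible only when $k = \omega/c$; any other choice leaves a mismatch of order $1 - \omega^2/(kc)^2$.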



2.2 Index of Refraction 

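Now let's examine how plane waves behave in dielectric media (e.g. glass). We
assume an isotropic,$^2$ homogeneous,$^3$ and non-conducting medium (i.e. $\mathbf{J}_{\rm free} = 0$).
In this case, we expect $\mathbf{E}$ and $\mathbf{P}$ to be parallel to each other so $\nabla \cdot \mathbf{P} = 0$ from (1.34).$^4$
The general wave equation (1.41) for the electric field reduces in this case to

$$\nabla^2 \mathbf{E} - \epsilon_0 \mu_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} = \mu_0 \frac{\partial^2 \mathbf{P}}{\partial t^2} \qquad (2.13)$$

Since we are considering sinusoidal waves, we consider solutions of the form

$$\mathbf{E} = \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}, \qquad \mathbf{P} = \mathbf{P}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \qquad (2.14)$$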

By writing this, we are making the (reasonable) assumption that if an electric
field stimulates a medium at frequency $\omega$, then the polarization in the medium
also oscillates at frequency $\omega$. This assumption is typically rather good except
for extreme electric fields, which can generate frequency harmonics through
nonlinear effects (see P1.12). Recall that by our prior agreement, the complex
amplitudes of $\mathbf{E}$ and $\mathbf{P}$ carry phase information. Thus, while $\mathbf{E}$ and $\mathbf{P}$ in (2.14)
oscillate at the same frequency, they can be out of phase with respect to each
other. This phase discrepancy is most pronounced for materials that absorb
energy at the plane wave frequency.

2 Isotropic means the material behaves the same for propagation in any direction. Many crystals
are not isotropic as we'll see in Chapter 5.

3 Homogeneous means the material is everywhere the same throughout space.

4 This follows for a wave of the form (2.14) if $\mathbf{P}$ and $\mathbf{k}$ are perpendicular.

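Substitution of the trial solutions (2.14) into (2.13) yields

$$-k^2 \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} + \epsilon_0 \mu_0 \omega^2 \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} = -\mu_0 \omega^2 \mathbf{P}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \qquad (2.15)$$

To go further, we need to make an explicit connection between $\mathbf{E}_0$ and $\mathbf{P}_0$ (external
to Maxwell's equations). In a linear medium, the polarization amplitude is
proportional to the strength of the applied electric field:

$$\mathbf{P}_0(\omega) = \epsilon_0 \chi(\omega) \mathbf{E}_0(\omega) \qquad (2.16)$$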

This is known as a constitutive relation. We have introduced a dimensionless
proportionality factor $\chi(\omega)$ called the susceptibility, which depends on the frequency
of the field. We account for the possibility that $\mathbf{E}$ and $\mathbf{P}$ oscillate out of phase by
allowing $\chi(\omega)$ to be a complex number.

By inserting (2.16) into (2.15) and canceling the field terms, we obtain the
dispersion relation in dielectrics:

$$k^2 = \epsilon_0 \mu_0 \left[ 1 + \chi(\omega) \right] \omega^2 \quad \text{or} \quad k = \frac{\omega}{c} \sqrt{1 + \chi(\omega)} \qquad (2.17)$$

where we have used $c \equiv 1/\sqrt{\epsilon_0 \mu_0}$. In general, $\chi(\omega)$ is a complex number, which
leads to a complex index of refraction, defined by$^5$

$$\mathcal{N}(\omega) \equiv n(\omega) + i\kappa(\omega) = \sqrt{1 + \chi(\omega)} \qquad (2.18)$$

where $n$ and $\kappa$ are respectively the real and imaginary parts of the index. (Note
that $\kappa$ is not $k$.) According to (2.17), the magnitude of the wave vector is then
also complex:

$$k = \frac{\mathcal{N} \omega}{c} = \frac{(n + i\kappa)\, \omega}{c} \qquad (2.19)$$

The use of a complex index of refraction only makes sense in the context of the complex
representation of plane waves.

The complex index $\mathcal{N}$ takes into account absorption as well as the usual
oscillatory behavior of the wave. We see this by explicitly placing (2.19) into
(2.14):

$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0 e^{-\mathrm{Im}\{\mathbf{k}\} \cdot \mathbf{r}} e^{i(\mathrm{Re}\{\mathbf{k}\} \cdot \mathbf{r} - \omega t)} = \mathbf{E}_0 e^{-\frac{\kappa \omega}{c} \hat{\mathbf{u}} \cdot \mathbf{r}} e^{i\left( \frac{n \omega}{c} \hat{\mathbf{u}} \cdot \mathbf{r} - \omega t \right)} \qquad (2.20)$$

As before, $\hat{\mathbf{u}}$ is a real unit vector specifying the direction of $\mathbf{k}$. Again, when looking
at (2.20), by special agreement in advance, we should just think of the real part,
namely$^6$

$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0 e^{-\frac{\kappa \omega}{c} \hat{\mathbf{u}} \cdot \mathbf{r}} \cos\left[ \frac{n \omega}{c} \hat{\mathbf{u}} \cdot \mathbf{r} - \omega t + \phi \right] \qquad (2.21)$$

where an overall phase $\phi$ was formerly held in the complex vector $\tilde{\mathbf{E}}_0$. (The tilde
had been suppressed.) Figure 2.3 shows a graph of (2.21). The imaginary part of
the index $\kappa$ causes the wave to decay as it travels. The real part of the index $n$ is
associated with the oscillations of the wave. By inspection of the cosine argument
in (2.21), we see that the speed of the (diminishing) sinusoidal wave fronts is

$$v_{\rm phase}(\omega) = c/n(\omega) \qquad (2.22)$$

It is apparent that $n(\omega)$ is the ratio of the speed of the light in vacuum to the speed
of the wave in the material.

Figure 2.3 Electric field of a decaying plane wave, plotted against $kz$. For convenience
in plotting, the direction of propagation is chosen to be in the $z$
direction (i.e. $\hat{\mathbf{u}} = \hat{\mathbf{z}}$).

5 Electrodynamics books often use the electric displacement $\mathbf{D} \equiv \epsilon_0 \mathbf{E} + \mathbf{P} \equiv \epsilon \mathbf{E}$. See M. Born and
E. Wolf, Principles of Optics, 7th ed., p. 3 (Cambridge University Press, 1999). The permittivity
$\epsilon$ encapsulates the constitutive relation that connects $\mathbf{P}$ with $\mathbf{E}$. In a linear medium we have
$\epsilon = \epsilon_0 (1 + \chi)$, so that the index of refraction is given by $\mathcal{N} = \sqrt{\epsilon/\epsilon_0}$.

6 For the sake of simplicity in writing (2.21) we assume linearly polarized light. That is, all vector
components of $\mathbf{E}_0$ have the same complex phase $\phi$. We will consider other possibilities, such as
circularly polarized light, in chapter 6.
In a dielectric, the vacuum relations (2.3) and (2.4) are modified to read

$$\mathrm{Re}\{\mathbf{k}\} \equiv \frac{2\pi}{\lambda} \hat{\mathbf{u}} \qquad (2.23)$$

where

$$\lambda \equiv \lambda_{\rm vac}/n \qquad (2.24)$$

While the frequency $\omega$ is the same, whether in a material or in vacuum, the
wavelength $\lambda$ varies with the real part of the index $n$.

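Example 2.2

When $n = 1.5$, $\kappa = 0.1$, and $\nu = 5 \times 10^{14}$ Hz, find (a) the wavelength inside the
material, and (b) the propagation distance over which the amplitude of the wave
diminishes by the factor $e^{-1}$ (called the skin depth).

Solution: (a)

$$\lambda = \frac{\lambda_{\rm vac}}{n} = \frac{2\pi c}{n \omega} = \frac{c}{n \nu} = \frac{3 \times 10^{8}\ \text{m/s}}{1.5\, (5 \times 10^{14}\ \text{Hz})} = 400\ \text{nm}$$

(b)

$$\frac{c}{\kappa \omega} = \frac{c}{2\pi \kappa \nu} = \frac{3 \times 10^{8}\ \text{m/s}}{2\pi\, (0.1)\, (5 \times 10^{14}\ \text{Hz})} \approx 950\ \text{nm}$$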
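The arithmetic of Example 2.2 is easy to reproduce. The sketch below uses the same rounded value of $c$ as the example; the function names are illustrative, and `amplitude_factor` simply evaluates the decay envelope $e^{-\kappa\omega z/c}$ from (2.21).

```python
import math

C = 3.0e8  # speed of light, rounded as in the worked example (m/s)

def wavelength_in_medium(n, nu):
    """lambda = c / (n * nu), from (2.24)."""
    return C / (n * nu)

def skin_depth(kappa, nu):
    """Distance c / (kappa * omega) over which the amplitude falls by 1/e."""
    return C / (2 * math.pi * kappa * nu)

def amplitude_factor(kappa, nu, distance):
    """Decay envelope exp(-kappa * omega * distance / c) from (2.21)."""
    return math.exp(-2 * math.pi * kappa * nu * distance / C)

n, kappa, nu = 1.5, 0.1, 5e14
print(f"wavelength in material: {wavelength_in_medium(n, nu) * 1e9:.0f} nm")
print(f"skin depth: {skin_depth(kappa, nu) * 1e9:.0f} nm")
print(f"amplitude factor at one skin depth: {amplitude_factor(kappa, nu, skin_depth(kappa, nu)):.4f}")
```

The skin depth evaluates to about 955 nm, which the example rounds to 950 nm.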



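Obtaining $n$ and $\kappa$ from the complex susceptibility $\chi$

From (2.18) we have

$$(n + i\kappa)^2 = n^2 - \kappa^2 + i 2 n \kappa = 1 + \mathrm{Re}\{\chi\} + i\, \mathrm{Im}\{\chi\} = 1 + \chi \qquad (2.25)$$

The real parts and the imaginary parts in the above equation are separately equal:

$$n^2 - \kappa^2 = 1 + \mathrm{Re}\{\chi\} \quad \text{and} \quad 2 n \kappa = \mathrm{Im}\{\chi\} \qquad (2.26)$$

From the latter equation we have

$$\kappa = \mathrm{Im}\{\chi\}/2n \qquad (2.27)$$

When this is substituted into the first equation of (2.26) we get a quadratic in $n^2$

$$n^4 - \left( 1 + \mathrm{Re}\{\chi\} \right) n^2 - \frac{\left( \mathrm{Im}\{\chi\} \right)^2}{4} = 0 \qquad (2.28)$$

The positive$^7$ real root to this equation is

$$n = \sqrt{\frac{\left( 1 + \mathrm{Re}\{\chi\} \right) + \sqrt{\left( 1 + \mathrm{Re}\{\chi\} \right)^2 + \left( \mathrm{Im}\{\chi\} \right)^2}}{2}} \qquad (2.29)$$

The imaginary part of the index is then obtained from (2.27).

When absorption is small we can neglect the imaginary part of $\chi(\omega)$, and
(2.29) reduces to

$$n(\omega) = \sqrt{1 + \chi(\omega)} \qquad \text{(negligible absorption)} \qquad (2.30)$$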
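The closed-form root (2.29) can be compared against a direct complex square root of $1 + \chi$. The following is a minimal sketch; the function name is hypothetical, and the susceptibility $\chi = 1.24 + 0.3i$ is an assumed value chosen so that the result lands near the index used in Example 2.2.

```python
import cmath
import math

def index_from_susceptibility(chi):
    """Return (n, kappa) from a complex susceptibility via (2.29) and (2.27)."""
    a = 1.0 + chi.real
    n = math.sqrt((a + math.hypot(a, chi.imag)) / 2.0)  # positive real root (2.29)
    kappa = chi.imag / (2.0 * n)                        # (2.27)
    return n, kappa

chi = complex(1.24, 0.3)          # assumed susceptibility
n, kappa = index_from_susceptibility(chi)
direct = cmath.sqrt(1 + chi)      # principal root of (2.18), for comparison

print(f"n = {n:.6f}, kappa = {kappa:.6f}")
print(f"cmath.sqrt(1 + chi) = {direct.real:.6f} + {direct.imag:.6f}i")
```

Both routes give $n \approx 1.5$ and $\kappa \approx 0.1$, since $(1.5 + 0.1i)^2 = 2.24 + 0.3i$. The principal branch of the complex square root has a non-negative real part, which matches the choice of the positive root in (2.29).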



2.3 The Lorentz Model of Dielectrics 



To compute the index of refraction in either a dielectric or a conducting material, 
we require a model that describes the response of electrons in the material to 
the passing electric field wave. Of course, the model in turn influences how the 
electric field propagates, which is what influences the material in the first place! 
The model therefore must be solved together with the propagating field in a 
self-consistent manner. 

Hendrik Lorentz developed a very successful model in the late 1800s, which 
treats each (active) electron in the medium as a classical particle obeying Newton's 
second law (F = ma). In the case of a dielectric medium, electrons are subject to 
an elastic restoring force that keeps each electron bound to its respective atom 
and a damping force that dissipates energy and gives rise to absorption. 

The Lorentz model determines the susceptibility $\chi(\omega)$ (the connection between
the electric field $\mathbf{E}$ and the polarization $\mathbf{P}$) and hence the index of
refraction. The model assumes that all atoms (or molecules) in the medium are
identical, each with one (or a few) active electrons responding to the external
field. The atoms are uniformly distributed throughout space with $N$ identical
active electrons per volume (units of number per volume). The polarization of
the material is then

$$\mathbf{P} = N q_e \mathbf{r}_{\rm micro} \qquad (2.31)$$

Recall that polarization has units of dipoles per volume. Each dipole has strength
$q_e \mathbf{r}_{\rm micro}$, where $\mathbf{r}_{\rm micro}$ is a microscopic displacement of the electron from
equilibrium.



Hendrik Antoon Lorentz (1853-1928,
Dutch) was born in Arnhem, Netherlands,
the son of a successful nurseryman.
Hendrik's mother died when he was
four years old. He studied classical lan- 
guages and then entered the University 
of Leiden where he was strongly influ- 
enced by astronomy professor Frederik 
Kaiser, whose niece Hendrik married. 
Hendrik was persuaded to become a 
physicist and wrote a doctoral disserta- 
tion entitled "On the theory of reflection 
and refraction of light," in which he 
refined Maxwell's electromagnetic the- 
ory. Lorentz correctly hypothesized that 
the atoms were composed of charged 
particles, and that their movement was 
the source of light. He also derived the 
transformations of space and time, later 
used in Einstein's theory of relativity. 
Lorentz won the Nobel prize in 1902 
for his contributions to electromagnetic 
theory. (Wikipedia) 



7 It is possible to have $n < 0$ for so-called metamaterials, not considered here.

At the time of Lorentz, atoms were thought to be clouds of positive charge
wherein point-like electrons sat at rest unless stimulated by an applied electric
field. In our modern quantum-mechanical viewpoint, $\mathbf{r}_{\rm micro}$ corresponds to an
average displacement of the electronic cloud, which surrounds the nucleus (see
Fig. 2.4). The displacement $\mathbf{r}_{\rm micro}$ of the electron charge in an individual atom
depends on the local strength of the applied electric field $\mathbf{E}$ at the position of the
atom. Since the diameter of the electronic cloud is tiny compared to a wavelength
of (visible) light, we may consider the electric field to be uniform across any
individual atom.

The Lorentz model uses Newton's equation of motion to describe an electron
displacement from equilibrium within an atom. In accordance with the classical
laws of motion, the electron mass $m_e$ times its acceleration is equal to the sum of
the forces on the electron:

$$m_e \ddot{\mathbf{r}}_{\rm micro} = q_e \mathbf{E} - m_e \gamma \dot{\mathbf{r}}_{\rm micro} - k_{\rm Hooke} \mathbf{r}_{\rm micro} \qquad (2.32)$$

Figure 2.4 A distorted electronic
cloud becomes a dipole. (Panels: unperturbed; in an electric field.)

The electric field pulls on the electron with force $q_e \mathbf{E}$.$^8$ A drag force (or friction)
$-m_e \gamma \dot{\mathbf{r}}_{\rm micro}$ opposes the electron motion and accounts for absorption of energy.
Without this term, it is only possible to describe optical index at frequencies away
from where absorption takes place. Finally, $-k_{\rm Hooke} \mathbf{r}_{\rm micro}$ is a force accounting
for the fact that the electron is bound to the nucleus. This restoring force can be
thought of as an effective spring that pulls the displaced electron back towards
equilibrium with a force proportional to the amount of displacement, so this
term is essentially the familiar Hooke's law. With some rearranging, (2.32) can be
written as

$$\ddot{\mathbf{r}}_{\rm micro} + \gamma \dot{\mathbf{r}}_{\rm micro} + \omega_0^2 \mathbf{r}_{\rm micro} = \frac{q_e}{m_e} \mathbf{E} \qquad (2.33)$$

where $\omega_0 \equiv \sqrt{k_{\rm Hooke}/m_e}$ is the natural oscillation frequency (or resonant
frequency) associated with the electron mass and the 'spring constant.'

There is a subtle problem with our analysis, which we will continue to neglect
in this section, but which should be mentioned. The field $\mathbf{E}$ in (2.32) is the net
field, which is influenced by the presence of all of the dipoles. The actual field that
a dipole 'feels', however, does not include its own field. That is, we should remove
from $\mathbf{E}$ the field produced by each dipole in its own vicinity. This significantly
modifies the result if the density of the material is sufficiently high. This effect is
described by the Clausius-Mossotti formula, which is treated in appendix 2.B.

In accordance with our examination of a single sinusoidal wave, we insert
(2.14) into (2.33) and obtain

$$\ddot{\mathbf{r}}_{\rm micro} + \gamma \dot{\mathbf{r}}_{\rm micro} + \omega_0^2 \mathbf{r}_{\rm micro} = \frac{q_e}{m_e} \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \qquad (2.34)$$

Note that within a given atom the excursions of $\mathbf{r}_{\rm micro}$ are so small that $\mathbf{k} \cdot \mathbf{r}$ remains
essentially constant, since $\mathbf{k} \cdot \mathbf{r}$ varies with displacements on the scale of an optical
wavelength, which is huge compared to the size of an atom. The inhomogeneous
solution to (2.34) is (see P2.1)

$$\mathbf{r}_{\rm micro} = \left( \frac{q_e}{m_e} \right) \frac{\mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}}{\omega_0^2 - i\omega\gamma - \omega^2} \qquad (2.35)$$

The electron position $\mathbf{r}_{\rm micro}$ oscillates (not surprisingly) with the same frequency
$\omega$ as the driving electric field. This solution illustrates the convenience of complex
notation. The imaginary part in the denominator, which comes from the damping
term $\gamma$, implies that the electron oscillates somewhat out of phase with the electric
field. The complex algebra in (2.35) accomplishes quite easily what would
otherwise be cumbersome (i.e. working out a trigonometric phase).

8 The electron also experiences a force due to the magnetic field of the light, $\mathbf{F} = q_e \mathbf{v}_{\rm micro} \times \mathbf{B}$,
but this force is tiny for typical optical fields.

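We are now able to write the polarization in terms of the electric field. By
substituting (2.35) into (2.31) and rearranging, we obtain

$$\mathbf{P} = \epsilon_0 \left( \frac{\omega_p^2}{\omega_0^2 - i\omega\gamma - \omega^2} \right) \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \qquad (2.36)$$

where the plasma frequency $\omega_p$ has been introduced:$^9$

$$\omega_p \equiv \sqrt{\frac{N q_e^2}{\epsilon_0 m_e}} \qquad (2.37)$$

A comparison of (2.36) with (2.16) reveals the (complex) susceptibility:

$$\chi(\omega) = \frac{\omega_p^2}{\omega_0^2 - i\omega\gamma - \omega^2} \qquad (2.38)$$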



The index of refraction is then found by substituting the susceptibility (2.38) into
(2.18). The real and imaginary parts of the index are solved by equating separately
the real and imaginary parts of (2.18), namely

$$(n + i\kappa)^2 = 1 + \chi(\omega) = 1 + \frac{\omega_p^2}{\omega_0^2 - i\omega\gamma - \omega^2} \qquad (2.39)$$

A graph of $n$ and $\kappa$ is given in Fig. 2.5.
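To get a feel for (2.39), the index of a single Lorentz oscillator can be evaluated at a few frequencies around resonance. This is a sketch under stated assumptions: frequencies are expressed in units of $\gamma$, the value $\omega_p = 10\gamma$ follows the caption of Fig. 2.5, and $\omega_0 = 100\gamma$ is an assumed value chosen only for illustration. The function names are hypothetical.

```python
import cmath

def lorentz_susceptibility(omega, omega_0, omega_p, gamma):
    """Complex susceptibility of a single Lorentz oscillator, equation (2.38)."""
    return omega_p**2 / (omega_0**2 - 1j * omega * gamma - omega**2)

def lorentz_index(omega, omega_0, omega_p, gamma):
    """Return (n, kappa) from (2.39) using the principal complex square root."""
    index = cmath.sqrt(1 + lorentz_susceptibility(omega, omega_0, omega_p, gamma))
    return index.real, index.imag

# Frequencies in units of gamma; omega_0 is an assumed illustrative value.
gamma, omega_p, omega_0 = 1.0, 10.0, 100.0

for offset in (-5.0, 0.0, 5.0):
    n, kappa = lorentz_index(omega_0 + offset * gamma, omega_0, omega_p, gamma)
    print(f"(omega - omega_0)/gamma = {offset:+.0f}: n = {n:.4f}, kappa = {kappa:.4f}")
```

With these parameters the absorption $\kappa$ peaks at resonance, while $n$ sits above 1 just below resonance and dips below 1 just above it.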

Most materials actually have more than one species of active electron, and
different active electrons behave differently. The generalization of (2.39) in this
case is

$$(n + i\kappa)^2 = 1 + \chi(\omega) = 1 + \sum_j \frac{f_j\, \omega_{p\,j}^2}{\omega_{0\,j}^2 - i\omega\gamma_j - \omega^2} \qquad (2.40)$$

where $f_j$ is the aptly named oscillator strength for the $j$th species of active electron.
Each species also has its own plasma frequency $\omega_{p\,j}$, natural frequency $\omega_{0\,j}$, and
damping coefficient $\gamma_j$.



9 In a plasma, charges move freely so that both the Hooke restoring force and the dragging term
can be neglected (i.e. $\omega_0 = 0$, $\gamma = 0$). For a plasma, $\omega_p$ is the dominant parameter.

Figure 2.5 Real and imaginary
parts of the index for a single
Lorentz oscillator dielectric with
$\omega_p = 10\gamma$, plotted against $(\omega - \omega_0)/\gamma$.

Figure 2.6 Real and imaginary
parts of the index for conductor
with $\omega_p = 50\gamma$, plotted against $(\omega - \omega_p)/\gamma$.


Lorentz introduced this model well before the development of quantum
mechanics. Even though the model pays no attention to quantum physics, it
works surprisingly well for describing frequency-dependent optical indices and
absorption of light. As it turns out, the Schrödinger equation applied to two levels
in an atom reduces in mathematical form to the Lorentz model in the limit of
low-intensity light. Quantum mechanics also explains the oscillator strength,
which before the development of quantum mechanics had to be inserted ad hoc
to make the model agree with experiments. The friction term $\gamma$ turns out not to be
associated with something internal to atoms but rather with collisions between
atoms, which on average give rise to the same behavior.

2.4 Index of Refraction of a Conductor 

In a conducting medium, the outer electrons of atoms are free to move without
being tethered to any particular atom. However, the electrons are still subject to a
damping force due to collisions that remove energy and give rise to absorption.
Such collisions are associated with resistance in a conductor. As it turns out,
we can obtain a simple formula for the refractive index of a conductor from the
Lorentz model in section 2.3. We simply remove the restoring force that binds
electrons to their atoms. That is, we set $\omega_0 = 0$ in (2.39), which gives

$$(n + i\kappa)^2 = 1 - \frac{\omega_p^2}{i\gamma\omega + \omega^2} \qquad (2.41)$$

This underscores the fact that $\partial \mathbf{P}/\partial t$ is a current very much like $\mathbf{J}_{\rm free}$. When
we remove the restoring force $k_{\rm Hooke} = m_e \omega_0^2$ from the atomic model, the electrons
effectively become free, and it is not surprising that they exactly mimic the
behavior of a free current $\mathbf{J}_{\rm free}$. A graph of $n$ and $\kappa$ in the conductor model is given
in Fig. 2.6. Below, we provide the derivation for (2.41) in the context of $\mathbf{J}_{\rm free}$ rather
than as a limiting case of the dielectric model.$^{10}$



10 G. Burns, Solid State Physics, Sect. 9-5 (Orlando: Academic Press, 1985).

Derivation of Refractive Index for a Conductor

We will include the current density $\mathbf{J}_{\rm free}$ while setting the medium polarization $\mathbf{P}$
to zero. The wave equation is

$$\nabla^2 \mathbf{E} - \epsilon_0 \mu_0 \frac{\partial^2 \mathbf{E}}{\partial t^2} = \mu_0 \frac{\partial \mathbf{J}_{\rm free}}{\partial t} \qquad (2.42)$$

We assume that the current is made up of individual electrons traveling with
velocity $\mathbf{v}_{\rm micro}$:

$$\mathbf{J}_{\rm free} = N q_e \mathbf{v}_{\rm micro} \qquad (2.43)$$

As before, $N$ is the number density of free electrons (in units of number per volume).
Recall that current density $\mathbf{J}_{\rm free}$ has units of charge times velocity per volume
(or current per cross sectional area), so (2.43) may be thought of as a definition of
current density in a fundamental sense.

Again, the electrons satisfy Newton's equation of motion, similar to (2.32) except
without a restoring force:

$$m_e \ddot{\mathbf{r}}_{\rm micro} = q_e \mathbf{E} - m_e \gamma \dot{\mathbf{r}}_{\rm micro} \qquad (2.44)$$

For a sinusoidal electric field $\mathbf{E} = \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}$, the solution to this equation is

$$\mathbf{v}_{\rm micro} \equiv \dot{\mathbf{r}}_{\rm micro} = \left( \frac{q_e}{m_e} \right) \frac{\mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}}{\gamma - i\omega} \qquad (2.45)$$

where again we assume that the electron oscillation excursions described by $\mathbf{r}_{\rm micro}$
are small compared to the wavelength so that $\mathbf{r}$ can be treated as a constant in
(2.44). The current density (2.43) in terms of the electric field is then



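$$\mathbf{J}_{\rm free} = \left( \frac{N q_e^2}{m_e} \right) \frac{\mathbf{E}}{\gamma - i\omega} \qquad (2.46)$$

We substitute this together with the electric field into the wave equation (2.42) and
get

$$-k^2 \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} + \frac{\omega^2}{c^2} \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} = -i\omega\mu_0 \left( \frac{N q_e^2}{m_e} \right) \frac{\mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}}{\gamma - i\omega} \qquad (2.47)$$

This simplifies down to the dispersion relation

$$k^2 = \frac{\omega^2}{c^2} \left( 1 - \frac{\omega_p^2}{i\gamma\omega + \omega^2} \right) \qquad (2.48)$$

which agrees with (2.41). We have made the substitution $\omega_p^2 = N q_e^2/\epsilon_0 m_e$ in
accordance with (2.37). As usual, $k^2 = \omega^2 (1 + \chi)/c^2 = \omega^2 (n + i\kappa)^2/c^2$, so the
susceptibility and the index may be extracted from (2.48).

Note that in the low-frequency limit (i.e. $\omega \ll \gamma$), the current density (2.46)
reduces to Ohm's law $\mathbf{J} = \sigma \mathbf{E}$, where $\sigma = N q_e^2/m_e \gamma$ is the DC conductivity. In
the high-frequency limit (i.e. $\omega \gg \gamma$), the behavior changes over to that of a
free plasma, where collisions, which are responsible for resistance, become less
important since the excursions of the electrons during oscillations become very
small. This formula captures the general behavior of metals, but actual values of
the index vary from this somewhat (see P2.6).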
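The conductor formulas (2.41) and (2.46) can be explored with a short script. This is a sketch under stated assumptions: the index is evaluated with frequencies in units of $\gamma$ and $\omega_p = 50\gamma$ as in the caption of Fig. 2.6, while the electron density and damping rate used for the conductivity are hypothetical round numbers, not data for any particular metal. The function names are illustrative.

```python
import cmath
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity (F/m)
Q_E = 1.602176634e-19     # elementary charge magnitude (C)
M_E = 9.1093837015e-31    # electron mass (kg)

def conductor_index(omega, omega_p, gamma):
    """Return (n, kappa) from (2.41) using the principal complex square root."""
    index = cmath.sqrt(1 - omega_p**2 / (1j * gamma * omega + omega**2))
    return index.real, index.imag

def conductivity(omega, density, gamma):
    """Frequency-dependent sigma in J_free = sigma * E, read off from (2.46)."""
    return (density * Q_E**2 / M_E) / (gamma - 1j * omega)

def plasma_frequency(density):
    """omega_p from (2.37), in rad/s."""
    return math.sqrt(density * Q_E**2 / (EPS0 * M_E))

# Index below and above the plasma frequency (units of gamma, omega_p = 50 gamma)
for omega in (25.0, 100.0):
    n, kappa = conductor_index(omega, omega_p=50.0, gamma=1.0)
    print(f"omega = {omega:5.1f} gamma: n = {n:.4f}, kappa = {kappa:.4f}")

# Assumed (hypothetical) metal-like parameters for the conductivity
density, gamma_si = 6e28, 1e14   # electrons per m^3, collisions per second
print(f"omega_p = {plasma_frequency(density):.3e} rad/s")
print(f"DC conductivity = {conductivity(0.0, density, gamma_si).real:.3e} S/m")
```

Below the plasma frequency the index is dominated by $\kappa$, so the wave is strongly attenuated; well above it, $\kappa$ becomes small and $n$ approaches 1 from below.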

In either the conductor or dielectric model, the damping term removes energy 
from electron oscillations. The damping term gives rise to an imaginary part 
of the index, which causes an exponential attenuation of the plane wave as it 
propagates. 

2.5 Poynting's Theorem 



Figure 2.7 The electrons in a
conductor can easily move in
response to the applied field.

Until now, we have described light as the propagation of an electromagnetic
disturbance. However, we typically observe light by detecting absorbed energy
rather than the field amplitude directly. In this section we examine the connection
between propagating electromagnetic fields (such as the plane waves discussed
in this chapter) and the energy transported by such fields.

John Henry Poynting (1852-1914,
English) was the youngest son of a Unitarian
minister who operated a school
near Manchester, England, where John
received his childhood education. He
later attended Owens College in Manchester
and then went on to Cambridge
University where he distinguished himself
in mathematics and worked under
James Maxwell in the Cavendish Laboratory.
Poynting joined the faculty of
the University of Birmingham (then
called Mason Science College) where
he was a professor of physics from 1880
until his death. Besides developing his
famous theorem on the conservation
of energy in electromagnetic fields, he
performed innovative measurements of
Newton's gravitational constant and
discovered that the Sun's radiation
draws in small particles towards it, the
Poynting-Robertson effect. Poynting
was the principal author of a multi-volume
undergraduate physics textbook,
which was in wide use until the 1930s.
(Wikipedia)

In the late 1800s John Poynting developed (from Maxwell's equations) the the- 
oretical foundation that describes light energy transport. You should appreciate 
and remember the ideas involved, especially the definition and meaning of the 
Poynting vector, even if you forget the specifics of its derivation. 

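Derivation of Poynting's Theorem

We require just two of Maxwell's Equations: (1.3) and (1.4). We take the dot product
of $\mathbf{B}/\mu_0$ with the first equation and the dot product of $\mathbf{E}$ with the second equation.
Then by subtracting the second equation from the first we obtain

$$\frac{\mathbf{B}}{\mu_0} \cdot (\nabla \times \mathbf{E}) - \mathbf{E} \cdot \left( \nabla \times \frac{\mathbf{B}}{\mu_0} \right) + \epsilon_0 \mathbf{E} \cdot \frac{\partial \mathbf{E}}{\partial t} + \frac{\mathbf{B}}{\mu_0} \cdot \frac{\partial \mathbf{B}}{\partial t} = -\mathbf{E} \cdot \mathbf{J} \qquad (2.49)$$

The first two terms can be simplified using the vector identity P0.8. The next two
terms are the time derivatives of $\epsilon_0 E^2/2$ and $B^2/2\mu_0$, respectively. The relation
(2.49) then becomes

$$\nabla \cdot \left( \mathbf{E} \times \frac{\mathbf{B}}{\mu_0} \right) + \frac{\partial}{\partial t} \left( \frac{\epsilon_0 E^2}{2} + \frac{B^2}{2\mu_0} \right) = -\mathbf{E} \cdot \mathbf{J} \qquad (2.50)$$

This is Poynting's theorem. Each term in this equation has units of power per
volume.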



It is conventional to write Poynting's theorem as follows:$^{11}$

$$\nabla \cdot \mathbf{S} + \frac{\partial}{\partial t} \left( u_{\rm field} + u_{\rm medium} \right) = 0 \qquad (2.51)$$

where

$$\mathbf{S} \equiv \mathbf{E} \times \frac{\mathbf{B}}{\mu_0} \qquad (2.52)$$

is called the Poynting vector, which has units of power per area, called irradiance.
The expression

$$u_{\rm field} \equiv \frac{\epsilon_0 E^2}{2} + \frac{B^2}{2\mu_0} \qquad (2.53)$$

is the energy per volume stored in the electric and magnetic fields. Derivations of
the electric field energy density and the magnetic field energy density are given in
Appendices 2.C and 2.D. (See (2.79) and (2.86).) The derivative

$$\frac{\partial u_{\rm medium}}{\partial t} \equiv \mathbf{E} \cdot \mathbf{J} \qquad (2.54)$$

describes the power per volume delivered to the medium from the field. Equation
(2.54) is reminiscent of the familiar circuit power law, Power = Voltage $\times$
Current. Power is delivered when a charged particle traverses a distance while
experiencing a force. This happens when currents flow in the presence of electric
fields.

11 See D.J. Griffiths, Introduction to Electrodynamics, 3rd ed., Sect. 8.1.2 (New Jersey: Prentice-Hall,
1999).

Poynting's theorem is essentially a statement of the conservation of energy,
where $\mathbf{S}$ describes the flow of energy. To appreciate this, consider Poynting's
theorem (2.51) integrated over a volume $V$ (enclosed by surface $S$). If we also
apply the divergence theorem (0.11) to the term involving $\nabla \cdot \mathbf{S}$ we obtain

$$\oint_S \mathbf{S} \cdot \hat{\mathbf{n}}\, da = -\frac{\partial}{\partial t} \int_V \left( u_{\rm field} + u_{\rm medium} \right) dv \qquad (2.55)$$

Notice that the volume integral over energy densities $u_{\rm field}$ and $u_{\rm medium}$ gives
the total energy stored in $V$, whether in the form of electromagnetic field energy
density or as energy density that has been given to the medium. The integration
of the Poynting vector over the surface gives the net Poynting vector flux directed
outward. Equation (2.55) indicates that the outward Poynting vector flux matches
the rate that total energy disappears from the interior of $V$. Conversely, if the
Poynting vector is directed inward (negative), then the net inward flux matches
the rate that energy increases within $V$. The vector $\mathbf{S}$ defines the flow of energy
through space. Its units of power per area are just what is needed to describe the
brightness of light impinging on a surface.



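Example 2.3

(a) Find the Poynting vector $\mathbf{S}$ and energy density $u_{\rm field}$ for the plane wave field
$\mathbf{E} = \hat{\mathbf{x}} E_0 \cos(kz - \omega t)$ traveling in vacuum. (b) Check that $\mathbf{S}$ and $u_{\rm field}$ satisfy Poynting's
theorem.

Solution: The associated magnetic field is (see P1.2)

$$\mathbf{B} = \frac{\hat{\mathbf{z}} k \times \hat{\mathbf{x}} E_0}{\omega} \cos(kz - \omega t) = \hat{\mathbf{y}} \frac{k E_0}{\omega} \cos(kz - \omega t)$$

(a) The Poynting vector is

$$\mathbf{S} = \frac{\mathbf{E} \times \mathbf{B}}{\mu_0} = \hat{\mathbf{x}} E_0 \cos(kz - \omega t) \times \hat{\mathbf{y}} \frac{k E_0}{\omega \mu_0} \cos(kz - \omega t) = \hat{\mathbf{z}}\, c \epsilon_0 E_0^2 \cos^2(kz - \omega t)$$

where we have used $\omega = kc$ and $\mu_0 = 1/(c^2 \epsilon_0)$. The energy density is

$$u_{\rm field} = \frac{\epsilon_0 E^2}{2} + \frac{B^2}{2\mu_0} = \frac{\epsilon_0 E_0^2}{2} \cos^2(kz - \omega t) + \frac{k^2 E_0^2}{2\mu_0 \omega^2} \cos^2(kz - \omega t) = \epsilon_0 E_0^2 \cos^2(kz - \omega t)$$

Notice that $S = c u$. The energy density traveling at speed $c$ gives rise to the power
per area passing a surface (perpendicular to $z$).

(b) We have

$$\nabla \cdot \mathbf{S} = c \epsilon_0 E_0^2 \frac{\partial}{\partial z} \cos^2(kz - \omega t) = -2 k c \epsilon_0 E_0^2 \cos(kz - \omega t) \sin(kz - \omega t)$$

whereas

$$\frac{\partial u_{\rm field}}{\partial t} = \epsilon_0 E_0^2 \frac{\partial}{\partial t} \cos^2(kz - \omega t) = 2 \omega \epsilon_0 E_0^2 \cos(kz - \omega t) \sin(kz - \omega t)$$

Poynting's theorem (2.50) is satisfied since $\omega = kc$.

It is common to replace the rapidly oscillating function $\cos^2(kz - \omega t)$ with its time
average 1/2, but this would have inhibited our ability to take the above derivatives.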
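Part (b) of Example 2.3 can be checked numerically by differencing the expressions found in part (a). The sketch below is illustrative only; the wavelength, field amplitude, and evaluation point are assumed values, and the derivatives are central finite differences rather than exact.

```python
import math

C = 299_792_458.0         # speed of light in vacuum (m/s)
EPS0 = 8.8541878128e-12   # vacuum permittivity (F/m)

def poynting_z(z, t, E0, k):
    """S_z = c * eps0 * E0^2 * cos^2(kz - omega t), with omega = k c."""
    return C * EPS0 * E0**2 * math.cos(k * z - k * C * t) ** 2

def energy_density(z, t, E0, k):
    """u_field = eps0 * E0^2 * cos^2(kz - omega t), with omega = k c."""
    return EPS0 * E0**2 * math.cos(k * z - k * C * t) ** 2

def poynting_terms(z, t, E0, k):
    """Return (div S, du_field/dt) from central finite differences."""
    dz = 1e-4 * 2 * math.pi / k   # small fraction of a wavelength
    dt = dz / C                   # matching fraction of a period
    div_S = (poynting_z(z + dz, t, E0, k) - poynting_z(z - dz, t, E0, k)) / (2 * dz)
    du_dt = (energy_density(z, t + dt, E0, k) - energy_density(z, t - dt, E0, k)) / (2 * dt)
    return div_S, du_dt

wavelength = 500e-9               # assumed vacuum wavelength (m)
k = 2 * math.pi / wavelength
E0 = 100.0                        # assumed field amplitude (V/m)

div_S, du_dt = poynting_terms(0.1 * wavelength, 0.0, E0, k)
scale = k * C * EPS0 * E0**2      # natural size of each term
print("div S / scale:          ", div_S / scale)
print("du/dt / scale:          ", du_dt / scale)
print("(div S + du/dt) / scale:", (div_S + du_dt) / scale)
```

Each term is individually of order one in these units, while their sum vanishes to within rounding error, as (2.50) requires in vacuum where $\mathbf{J} = 0$.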



2.6 Irradiance of a Plane Wave 



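Consider the electric plane-wave field $\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}$. The magnetic field
that accompanies this electric field can be found from Maxwell's equation (1.3),
and it turns out to be (compare with problem P1.2)

$$\mathbf{B}(\mathbf{r}, t) = \frac{\mathbf{k} \times \mathbf{E}_0}{\omega} e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \qquad (2.56)$$

When $\mathbf{k}$ is complex, $\mathbf{B}$ is out of phase with $\mathbf{E}$, and this occurs when absorption
takes place. When there is no absorption, then $\mathbf{k}$ is real, and $\mathbf{B}$ and $\mathbf{E}$ carry the
same complex phase.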

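Before computing the Poynting vector (2.52), which involves multiplication,
we must remember our unspoken agreement that only the real parts of the fields
are relevant. We necessarily remove the imaginary parts before multiplying (see
(0.22)). To obtain the real parts of the fields, we add their respective complex
conjugates and divide the result by 2 (see (0.30)). The real field associated with
the plane-wave electric field is

$$\mathbf{E}(\mathbf{r}, t) = \frac{1}{2} \left[ \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} + \mathbf{E}_0^* e^{-i(\mathbf{k}^* \cdot \mathbf{r} - \omega t)} \right] \qquad (2.57)$$

and the real field associated with (2.56) is

$$\mathbf{B}(\mathbf{r}, t) = \frac{1}{2} \left[ \frac{\mathbf{k} \times \mathbf{E}_0}{\omega} e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} + \frac{\mathbf{k}^* \times \mathbf{E}_0^*}{\omega} e^{-i(\mathbf{k}^* \cdot \mathbf{r} - \omega t)} \right] \qquad (2.58)$$

We have merely exercised our previous (conspiratorial) agreement that only the
real parts of the plane-wave electric field and of (2.56) are to be retained.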

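Now we are ready to calculate the Poynting vector. The algebra is a little messy
in general, so we restrict the analysis to the case of an isotropic medium for the
sake of simplicity.

Calculation of the Poynting Vector for a Plane Wave

Using (2.57) and (2.58) in (2.52) gives

$$\begin{aligned}
\mathbf{S} &\equiv \mathbf{E} \times \frac{\mathbf{B}}{\mu_0} \\
&= \frac{1}{2} \left[ \mathbf{E}_0 e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} + \mathbf{E}_0^* e^{-i(\mathbf{k}^* \cdot \mathbf{r} - \omega t)} \right] \times \frac{1}{2\mu_0} \left[ \frac{\mathbf{k} \times \mathbf{E}_0}{\omega} e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} + \frac{\mathbf{k}^* \times \mathbf{E}_0^*}{\omega} e^{-i(\mathbf{k}^* \cdot \mathbf{r} - \omega t)} \right] \\
&= \frac{1}{4\mu_0} \left[ \frac{\mathbf{E}_0 \times (\mathbf{k} \times \mathbf{E}_0)}{\omega} e^{2i(\mathbf{k} \cdot \mathbf{r} - \omega t)} + \frac{\mathbf{E}_0^* \times (\mathbf{k} \times \mathbf{E}_0)}{\omega} e^{i(\mathbf{k} - \mathbf{k}^*) \cdot \mathbf{r}} \right. \\
&\qquad\quad \left. + \frac{\mathbf{E}_0 \times (\mathbf{k}^* \times \mathbf{E}_0^*)}{\omega} e^{i(\mathbf{k} - \mathbf{k}^*) \cdot \mathbf{r}} + \frac{\mathbf{E}_0^* \times (\mathbf{k}^* \times \mathbf{E}_0^*)}{\omega} e^{-2i(\mathbf{k}^* \cdot \mathbf{r} - \omega t)} \right] \\
&= \frac{1}{4\mu_0} \left[ \frac{k}{\omega} \mathbf{E}_0 \times (\hat{\mathbf{u}} \times \mathbf{E}_0) e^{2i(\mathbf{k} \cdot \mathbf{r} - \omega t)} + \frac{k}{\omega} \mathbf{E}_0^* \times (\hat{\mathbf{u}} \times \mathbf{E}_0) e^{-2\frac{\kappa \omega}{c} \hat{\mathbf{u}} \cdot \mathbf{r}} + \text{C.C.} \right]
\end{aligned} \qquad (2.59)$$

The letters 'C.C.' stand for the complex conjugate of what precedes in the square
brackets. The direction of $\mathbf{k}$ is specified with the real unit vector $\hat{\mathbf{u}}$. We have also
used (2.19) to rewrite $i(\mathbf{k} - \mathbf{k}^*)$ as $-2(\kappa \omega/c)\hat{\mathbf{u}}$.

The assumption of an isotropic medium (not a crystal) means that $\nabla \cdot \mathbf{E}(\mathbf{r}, t) = 0$
and therefore $\hat{\mathbf{u}} \cdot \mathbf{E}_0 = 0$. We can use this fact together with the BAC-CAB rule P0.3
to reduce the above expression to

$$\mathbf{S} = \frac{\hat{\mathbf{u}}}{4\mu_0} \left[ \frac{k}{\omega} (\mathbf{E}_0 \cdot \mathbf{E}_0) e^{2i(\mathbf{k} \cdot \mathbf{r} - \omega t)} + \frac{k}{\omega} (\mathbf{E}_0 \cdot \mathbf{E}_0^*) e^{-2\frac{\kappa \omega}{c} \hat{\mathbf{u}} \cdot \mathbf{r}} + \text{C.C.} \right] \qquad (2.60)$$

The final expression shows that (in an isotropic medium) the flow of energy is in
the direction of $\hat{\mathbf{u}}$ (or $\mathbf{k}$). This agrees with our intuition that energy flows in the
direction that the wave propagates.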



Very often, we are interested in the time-average of the Poynting vector, denoted
by $\langle\mathbf{S}\rangle_t$. There are no electronics that can keep up with the rapid oscillation
of visible light (i.e. $> 10^{14}$ Hz). Therefore, what is always measured is the
time-averaged absorption of energy. Under time averaging, the first term in (2.60)
vanishes since it rapidly oscillates positive and negative. The time-averaged
Poynting vector (including the term C.C.) becomes



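$$\langle\mathbf{S}\rangle_t = \hat{\mathbf{u}}\,\frac{k+k^*}{4\mu_0\omega}\left(\mathbf{E}_0\cdot\mathbf{E}_0^*\right)e^{-2\frac{\kappa\omega}{c}\hat{\mathbf{u}}\cdot\mathbf{r}} = \hat{\mathbf{u}}\,\frac{n\epsilon_0 c}{2}\left(|E_{0x}|^2+|E_{0y}|^2+|E_{0z}|^2\right)e^{-2\frac{\kappa\omega}{c}\hat{\mathbf{u}}\cdot\mathbf{r}}$$ (2.61)

We have used (2.19) to rewrite $k + k^*$ as $2(n\omega/c)$. We have also used (1.43) to
rewrite $1/\mu_0 c$ as $\epsilon_0 c$.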

Table 2.1 Radiometric quantities and units.

Radiant Power (of a source): Electromagnetic energy. Units: W = J/s

Radiant Solid-Angle Intensity (of a source): Radiant power per steradian emitted from a
point-like source ($4\pi$ steradians in a sphere). Units: W/Sr

Radiance or Brightness (of a source): Radiant solid-angle intensity per unit projected area of
an extended source. The projected area foreshortens by $\cos\theta$, where $\theta$ is the
observation angle relative to the surface normal. Units: W/(Sr$\cdot$cm$^2$)

Radiant Emittance or Exitance (from a source): Radiant power emitted per unit surface area of an
extended source (the Poynting flux leaving). Units: W/cm$^2$

Irradiance (to a receiver), often called intensity: Electromagnetic power delivered per area to a
receiver (the Poynting flux arriving). Units: W/cm$^2$

Figure 2.8 The response of a "standard" human eye under relatively bright conditions (photopic)
and in dim conditions (scotopic). Horizontal axis: wavelength (nm), 400 to 800. Curve labels:
scotopic, peak of about 1700 lm/W at 507 nm; photopic, peak of 683 lm/W at 555 nm.

The expression (2.61) is formally called the irradiance (with the direction $\hat{\mathbf{u}}$
included). However, we often speak of the intensity of a field $I$, which amounts to
the same thing, but without regard for the direction $\hat{\mathbf{u}}$. The definition of intensity
is thus less specific, and it can be applied, for example, to standing waves where
the net irradiance is technically zero (i.e. counter-propagating plane waves with
zero net energy flow). Nevertheless, atoms in standing waves 'feel' the oscillating
field. In general, the intensity is written as

$$I = \frac{n\epsilon_0 c}{2}\left(|E_{0x}|^2+|E_{0y}|^2+|E_{0z}|^2\right)$$ (2.62)

where in this case we have ignored absorption (i.e. $\kappa \cong 0$). Alternatively, we
could consider $|E_{0x}|^2$, $|E_{0y}|^2$, and $|E_{0z}|^2$ to include the factor
$\exp\left(-2(\kappa\omega/c)\hat{\mathbf{u}}\cdot\mathbf{r}\right)$ so that they correspond to the local
electric field. Equation (2.62) agrees with S in Example 2.3 where $n = 1$ and
$\mathbf{E}_0 = \hat{\mathbf{x}}E_0$ is real; the cosine squared averages to 1/2.
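
To make the relation between field amplitude and intensity concrete, here is a minimal sketch of
(2.61) and (2.62) in Python. The function names, the SI inputs, and the sample value of 1 kW/m^2
are assumptions introduced for illustration; they are not part of the text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)
C = 299792458.0          # vacuum speed of light (m/s)

def intensity(E0, n=1.0, kappa=0.0, omega=0.0, distance=0.0):
    """Intensity in W/m^2 following (2.61) and (2.62).

    E0 is a sequence of (possibly complex) field components in V/m;
    distance is the path length along u in meters.
    """
    e_squared = sum(abs(component) ** 2 for component in E0)
    decay = math.exp(-2.0 * kappa * omega * distance / C)
    return 0.5 * n * EPS0 * C * e_squared * decay

def field_amplitude(I, n=1.0):
    """Invert (2.62) for the magnitude of E0 in V/m, ignoring absorption."""
    return math.sqrt(2.0 * I / (n * EPS0 * C))

# Hypothetical input: 1 kW/m^2 of light in vacuum.
E0_mag = field_amplitude(1000.0)
print(f"|E0| = {E0_mag:.1f} V/m")
print(f"round trip: {intensity((E0_mag, 0.0, 0.0)):.1f} W/m^2")
```

Because only the squared magnitudes of the components enter, the relative phases of the
components (and hence the polarization state) do not affect the result.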



Appendix 2.A Radiometry, Photometry, and Color 

The field of study that quantifies the energy in electromagnetic radiation (in- 
cluding visible light) is referred to as radiometry. Table 2.1 lists several concepts 
important in radiometry. The irradiance at a detector and the exitance from a
source are both direct measurements of the average Poynting flux, and the other 
quantities in the table are directly related to the Poynting flux through geometric 
factors. One of the challenges in radiometry is that light sensors always have a 
wavelength-dependent sensitivity to light, whereas the quantities in Table 2.1 
treat light of all wavelengths on equal footing. Disentangling the detector re- 
sponse from the desired signal in a radiometric measurement takes considerable 
care. 

Photometry refers to the characterization of light energy in the context of the 
response of the human eye. In contrast to radiometry, photometry takes great care 
to mimic the wavelength-dependent effects of the eye-brain detection system so 
that photometric quantities are an accurate reflection of our everyday experience 
with light. The concepts used in photometry are similar to those in radiometry, 
except that the radiometric quantities are multiplied by the spectral response of 
our eye-brain system. 

Our eyes contain two types of photoreceptors: rods and cones. The rods
are very sensitive and provide virtually all of our vision in dim light conditions
(e.g. when you are away from artificial light at night). Under these conditions we
experience scotopic vision, with a response curve that peaks at $\lambda_{\rm vac} = 507$ nm
and is insensitive to wavelengths longer than 640 nm$^{12}$ (see Fig. 2.8). As the
light gets brighter the less-sensitive cones take over, and we experience photopic
vision, with a response curve that peaks at $\lambda_{\rm vac} = 555$ nm and drops to near zero
for wavelengths longer than $\lambda_{\rm vac} = 700$ nm or shorter than $\lambda_{\rm vac} = 400$ nm (see
Fig. 2.8). Photometric quantities are usually measured using the bright-light
(photopic) response curve since that is what we typically experience in normally
lit spaces.



12 Since rods do not detect the longer red wavelengths, it is possible to have artificial red illumina- 
tion without ruining your dark-adapted vision. For example, an airplane can have red illumination 
on the instrument panel without interfering with a pilot's ability to achieve full dark-adapted vision 
to see things outside the cockpit.

Photometric units, which may seem a little obscure, were first defined in terms
of an actual candle with prescribed dimensions made from whale tallow. The
basic unit of luminous power is called the lumen, defined to be (1/683) W of light
with wavelength $\lambda_{\rm vac} = 555$ nm, the peak of the eye's bright-light response. More
radiant power is required to achieve the same number of lumens for wavelengths 
away from the center of the eye's spectral response. Photometric units are often 
used to characterize room lighting as well as photographic, projection, and display 
equipment. For example, both a 60 W incandescent bulb and a 13 W compact 
fluorescent bulb emit a little more than 800 lumens of light. The difference in 
photometric output versus radiometric output reflects the fact that most of the 
energy radiated from an incandescent bulb is emitted in the infrared, where 
our eyes are not sensitive. Table 2.2 gives the names of the various photometric 
quantities, which parallel the entries for radiometric quantities in Table 2.1. We 
include a variety of units that are sometimes encountered. 
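
The bulb comparison above can be turned into a short calculation. This is only a sketch: it treats
the quoted wattages as the total power budget of each bulb (an assumption about how the ratings
are meant), and the function names are hypothetical. The constant 683 lm/W follows directly from
the definition of the lumen given above.

```python
PEAK_LM_PER_W = 683.0  # lumens per watt of 555 nm light (definition of the lumen)

def efficacy(lumens, watts):
    """Overall luminous efficacy in lm/W."""
    return lumens / watts

def min_radiant_power(lumens):
    """Smallest radiant power (W) that can give this many lumens (all light at 555 nm)."""
    return lumens / PEAK_LM_PER_W

incandescent = efficacy(800.0, 60.0)   # 60 W incandescent bulb, about 800 lm
fluorescent = efficacy(800.0, 13.0)    # 13 W compact fluorescent bulb, about 800 lm
floor = min_radiant_power(800.0)

print(f"incandescent: {incandescent:.1f} lm/W")
print(f"fluorescent:  {fluorescent:.1f} lm/W")
print(f"800 lm requires at least {floor:.2f} W of light")
```

Both bulbs fall far below 683 lm/W, since no broadband source emits all of its power at the peak
of the eye's response.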

Cones come in three varieties, each of which is sensitive to light in different 
wavelength bands. Figure 2.9 plots the normalized sensitivity curves$^{13}$ for short
(S), medium (M), and long (L) wavelength cones. Because your brain gets separate 
signals from each type of cone, this system gives you the ability to measure 
basic information about the spectral content of light. We interpret this spectral 
information as the color of the light. When the three types of cones are stimulated 
equally the light appears white, and when they are stimulated differently the 
light appears colored. Light with different spectral distributions can produce the 
exact same color sensation, so our perception of color only gives very general 
information about the spectral content of light. For example, light coming from 
a television has a different spectral composition than the light incident on the 
camera that recorded the image, but both can produce the same color sensation. 
This ambiguity can lead to a potentially dangerous situation in the lab because 
lasers from 670 nm to 800 nm all appear the same color. (They all stimulate the 
L and M cones in essentially the same ratio.) However, your eye's response falls 
off quickly in the near-infrared, so a dangerous 800 nm high-intensity beam can 
appear about the same brightness as an innocuous 670 nm laser pointer. 

Because we have three types of cones, our perception of color can be
well-represented using a three-dimensional vector space referred to as a color
space.$^{14}$ A color space can be defined in terms of three "basis" light sources
referred to as primaries. Different colors (i.e. the "vectors" in the color space) are
created by mixing the primary light in different ratios. If we had three primaries
that separately stimulated each type of cone (S, M, and L), we could recreate any
color sensation exactly by mixing those primaries. However, by inspecting Fig. 2.9
you can see that this ideal set of primaries cannot be found because of the overlap
between the S, M, and L curves. Any light that will stimulate one type of cone will
also stimulate another. This overlap makes it impossible to display every possible
color with three primaries. (It is nevertheless possible to quantify all colors with three
primaries, even if the primaries can't display the colors; we'll see how shortly.)
The range of colors that can be displayed with a given set of primaries is referred
to as the gamut of that color space. As your experience with computers suggests,
we are able to engineer devices with a very broad gamut, but there are always colors
that cannot be displayed.

13 A. Stockman, L. Sharpe, and C. Fach, "The spectral sensitivity of the human short-wavelength
cones," Vision Research, 39, 2901-2927 (1999); A. Stockman, and L. Sharpe, "Spectral sensitivities
of the middle- and long-wavelength sensitive cones derived from measurements in observers of
known genotype," Vision Research, 40, 1711-1737 (2000).

14 The methods we use to represent color are very much tied to human physiology. Other species
have photoreceptors that sense different wavelength ranges or do not sense color at all. For instance,
Papilio butterflies have six types of cone-like photoreceptors and certain types of shrimp have
twelve. Reptiles have four-color vision for visible light, and pit vipers (a subgroup of snakes) have
an additional set of "eyes" that look like pits on the front of their face. These pits are essentially
pinhole cameras sensitive to infrared light, and give these reptiles crude night-vision capabilities.
(Not surprisingly, pit vipers hunt most actively at night time.) On the other hand, some insects can
perceive markings on flowers that are only visible in the ultraviolet. Each of these species would

Table 2.2 Photometric quantities and units.

Luminous Power (of a source): Visible light energy emitted per time from a source. Units: lumens
(lm), lm = (1/683) W at 555 nm

Luminous Solid-Angle Intensity (of a source): Luminous power per steradian emitted from a
point-like source. Units: candelas (cd), cd = lm/Sr

Luminance (of a source): Luminous solid-angle intensity per projected area of an extended source.
(The projected area foreshortens by $\cos\theta$, where $\theta$ is the observation angle relative to
the surface normal.) Units: cd/cm$^2$ = stilb, cd/m$^2$ = nit, 3183 nit = lambert,
3.4 nit = footlambert

Luminous Emittance or Exitance (from a source): Luminous power emitted per unit surface area of an
extended source. Units: lm/cm$^2$

Illuminance (to a receiver): Incident luminous power delivered per area to a receiver. Units: lux;
lm/m$^2$ = lux, lm/cm$^2$ = phot, lm/ft$^2$ = footcandle

Figure 2.9 Normalized cone sensitivity functions (horizontal axis: wavelength, 400 to 800 nm).

Figure 2.10 The CIE 1931 RGB color matching functions (horizontal axis: test wavelength, 400 to
700 nm).

The CIE1931 RGB$^{15}$ color space is a very commonly encountered color space
based on a series of experiments performed by W. David Wright and John Guild
in 1931. In these experiments, test subjects were asked to match the color of a
monochromatic test light source by mixing monochromatic primaries at 700 nm
(R), 546.1 nm (G), and 435.8 nm (B). The relative amount of R, G, and B light
required to match the color at each test wavelength was recorded as the color
matching functions $\bar{r}(\lambda)$, $\bar{g}(\lambda)$, and $\bar{b}(\lambda)$, shown in Fig. 2.10. Note that the color
matching functions sometimes go negative. This is most noticeable for $\bar{r}(\lambda)$, but
all three have negative values. These negative values indicate that the test color
was outside the gamut of the primaries (i.e. the color of the test source could not
be matched by adding primaries). In these cases, the observers matched the test
light as closely as possible by mixing primaries, and then they added some of the
primary light to the test light until the colors matched. The amount of primary
light that had to be added to the test light was recorded as a negative number. In
this way they were able to quantify the color, even though it couldn't be displayed
using their primaries.

It turns out that the eye responds essentially linearly with respect to color
perception. That is, if an observer perceives one light source to have components
$(R_1, G_1, B_1)$ and another light source to have components $(R_2, G_2, B_2)$, a mixture
of the two lights will have components $(R_1 + R_2, G_1 + G_2, B_1 + B_2)$. This linearity
allows us to calculate the color components of an arbitrary light source with
spectrum $I(\lambda)$ by integrating the spectrum against the color matching functions:

$$R = \int I(\lambda)\,\bar{r}\,d\lambda \qquad G = \int I(\lambda)\,\bar{g}\,d\lambda \qquad B = \int I(\lambda)\,\bar{b}\,d\lambda$$ (2.63)



If R, G, or B turn out to be negative for a given $I(\lambda)$, then that color of light
falls outside the gamut of these particular primaries. However, the negative
coordinates still provide a valid abstract representation of that color.
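
The integrals in (2.63) are straightforward to evaluate numerically. The sketch below is
illustrative only: the three color matching functions are HYPOTHETICAL Gaussian stand-ins (the
real CIE 1931 curves are tabulated data that are not reproduced here), and all function names
are our own. It demonstrates the linearity property and shows how a negative coordinate can arise.

```python
import math

def gaussian(lam, center, width):
    return math.exp(-0.5 * ((lam - center) / width) ** 2)

# HYPOTHETICAL stand-ins for rbar, gbar, bbar (wavelengths in nm); not CIE data.
def rbar(lam):
    return gaussian(lam, 600.0, 40.0) - 0.3 * gaussian(lam, 510.0, 30.0)

def gbar(lam):
    return gaussian(lam, 545.0, 40.0)

def bbar(lam):
    return gaussian(lam, 445.0, 25.0)

def integrate(f, a=380.0, b=780.0, steps=400):
    """Trapezoid rule over the visible range."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

def color_components(spectrum):
    """Return (R, G, B) for a spectrum I(lambda), following (2.63)."""
    return tuple(
        integrate(lambda lam, cmf=cmf: spectrum(lam) * cmf(lam))
        for cmf in (rbar, gbar, bbar)
    )

# Two assumed narrow-band sources and their mixture.
source_1 = lambda lam: gaussian(lam, 620.0, 15.0)
source_2 = lambda lam: gaussian(lam, 480.0, 20.0)
mixture = lambda lam: source_1(lam) + source_2(lam)

rgb_1 = color_components(source_1)
rgb_2 = color_components(source_2)
rgb_mix = color_components(mixture)
print(rgb_mix, tuple(a + b for a, b in zip(rgb_1, rgb_2)))

# A source near the negative lobe of the stand-in rbar gives R < 0 (out of gamut).
rgb_green = color_components(lambda lam: gaussian(lam, 510.0, 5.0))
print(rgb_green)
```

With real tabulated color matching functions the same routine applies once the stand-in
functions are replaced by interpolated data.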



find the color spaces we use to record and recreate color sensations very inaccurate. 

15 This is not the RGB space you have probably used on a computer; that space is referred
to as sRGB. CIE is an abbreviation for the French "Commission Internationale de l'Eclairage," an
international commission that defines lighting and color standards.

The RGB color space is an additive color model, where the primaries are added
together to produce color and the absence of light gives black. Subtractive color 
models produce color using a background that reflects all visible light equally so 
that it appears white (e.g. a piece of paper or canvas) and then placing absorbing 
pigments over the background to remove portions of the reflected spectrum. 

Some color spaces use four basis vectors. For example, color printers use 
the subtractive CMYK color space (Cyan, Magenta, Yellow, and Black), and some 
television manufacturers add a fourth type of primary (usually yellow) to their 
display. The fourth basis vector increases the range of colors that can be displayed 
by these systems (i.e. it increases the gamut). However, the fourth basis vector 
makes the color space overdetermined and only helps in displaying colors — we 
can abstractly represent all colors using just three coordinates (in an appropriately 
chosen basis). 



Example 2.4 

The CIE1931 XYZ color space is derived from the CIE1931 RGB space by the
transformation

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \frac{1}{0.17697}\begin{bmatrix} 0.49 & 0.31 & 0.20 \\ 0.17697 & 0.81240 & 0.01063 \\ 0.00 & 0.01 & 0.99 \end{bmatrix}\begin{bmatrix} R \\ G \\ B \end{bmatrix}$$ (2.64)

where X, Y, and Z are the color coordinates in the new basis. The matrix elements
in (2.64) were carefully chosen to give this color space some desirable properties:
none of the new coordinates (X, Y, or Z) are ever negative; the Y gives the
photometric brightness of the light and the X and Z coordinates describe the color
part (i.e. the chromaticity) of the light; and the coordinates (1/3,1/3,1/3) give the
color white. The XYZ coordinates do not represent new primaries, but rather linear
combinations of the original primaries. Find the representation in the CIE1931
RGB basis for each of the basis vectors in the XYZ space.



Solution: We first invert the transformation matrix to find

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 0.4185 & -0.1587 & -0.08283 \\ -0.09117 & 0.2524 & 0.01571 \\ 0.0009209 & -0.002550 & 0.1786 \end{bmatrix}\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$$

Then we can see that $X = 0.4185R - 0.09117G + 0.0009209B$, $Y = -0.1587R +
0.2524G - 0.002550B$, and $Z = -0.08283R + 0.01571G + 0.1786B$. Because the XYZ
primaries contain negative amounts of the physical RGB primaries, the XYZ basis
is not physically realizable. However, it is extensively used because it can abstractly
represent all colors using a triplet of positive numbers.
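
The matrix inversion in this example is easy to reproduce. The following sketch inverts the
matrix of (2.64) with the adjugate formula in plain Python; the helper name inverse3 is
hypothetical, and a linear-algebra library would serve equally well.

```python
SCALE = 1.0 / 0.17697

# Transformation matrix of (2.64), including the overall factor 1/0.17697.
M = [[0.49 * SCALE, 0.31 * SCALE, 0.20 * SCALE],
     [0.17697 * SCALE, 0.81240 * SCALE, 0.01063 * SCALE],
     [0.00 * SCALE, 0.01 * SCALE, 0.99 * SCALE]]

def inverse3(m):
    """Inverse of a 3x3 matrix via the adjugate divided by the determinant."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adjugate = [[e * i - f * h, c * h - b * i, b * f - c * e],
                [f * g - d * i, a * i - c * g, c * d - a * f],
                [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[entry / det for entry in row] for row in adjugate]

M_inv = inverse3(M)
for row in M_inv:
    print("  ".join(f"{entry: .4g}" for entry in row))

# The RGB representation of each XYZ basis vector is a column of the inverse.
X_in_rgb = [M_inv[row][0] for row in range(3)]
Y_in_rgb = [M_inv[row][1] for row in range(3)]
Z_in_rgb = [M_inv[row][2] for row in range(3)]
```

Reading down the columns reproduces the three linear combinations quoted in the solution.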



Appendix 2.B Clausius-Mossotti Relation 

Equation (2.35) has the form $\mathbf{r}_{\rm micro} = \alpha\mathbf{E}/q_e$, where $\alpha$ is called the atomic (or
molecular) polarizability. We take absorption to be negligible so that $\alpha$ is real. $\mathbf{E}$
is the macroscopic field in the medium, which includes a contribution from all of
the dipoles. To avoid double-counting the dipole's own field, we should replace $\mathbf{E}$
with

$$\mathbf{E}_{\rm actual} = \mathbf{E} - \mathbf{E}_{\rm dipole}$$ (2.65)

and write

$$q_e\mathbf{r}_{\rm micro} = \alpha\mathbf{E}_{\rm actual}$$ (2.66)

That is, we ought not to allow the dipole's own field to act on itself as we previously
(inadvertently) did. Here $\mathbf{E}_{\rm dipole}$ is the average field that a dipole contributes to its
quota of space in the material.

Since $N$ is the number of dipoles per volume, each dipole occupies a volume
$1/N$. As will be shown below, the average field due to a dipole$^{16}$ centered in such
a volume (symmetrically chosen) is

$$\mathbf{E}_{\rm dipole} = -\frac{N q_e\mathbf{r}_{\rm micro}}{3\epsilon_0}$$ (2.67)

Substitution of (2.67) and (2.66) into (2.65) yields

$$\mathbf{E}_{\rm actual} = \mathbf{E} + \frac{N\alpha\mathbf{E}_{\rm actual}}{3\epsilon_0} \quad\Rightarrow\quad \mathbf{E}_{\rm actual} = \frac{\mathbf{E}}{1-\frac{N\alpha}{3\epsilon_0}}$$ (2.68)

Then (2.66) becomes

$$q_e\mathbf{r}_{\rm micro} = \frac{\alpha\mathbf{E}}{1-\frac{N\alpha}{3\epsilon_0}}$$ (2.69)

Now according to (2.16) the susceptibility is defined via $\mathbf{P} = \epsilon_0\chi\mathbf{E}$, where $\mathbf{E}$ is
the macroscopic field. Also, the polarization is always based on the combined
behavior of all of the dipoles, $\mathbf{P} = Nq_e\mathbf{r}_{\rm micro}$ (see (2.31)). Therefore, the susceptibility
is

$$\chi(\omega) = \frac{N\alpha(\omega)/\epsilon_0}{1-N\alpha(\omega)/3\epsilon_0}$$ (2.70)

This is known as the Clausius-Mossotti relation. In section 2.3, we only included
the numerator of (2.70). The extra term in the denominator becomes important
when $N$ is sufficiently large, which is the case for liquid or solid densities.

Since we neglect absorption, from (2.25) we have $\chi = n^2 - 1$, and we may write

$$n^2 - 1 = \frac{N\alpha/\epsilon_0}{1-N\alpha/3\epsilon_0}$$ (2.71)

In this case, we may invert the relation to write $N\alpha/\epsilon_0$ in terms of the index:$^{17}$

$$\frac{N\alpha}{\epsilon_0} = 3\,\frac{n^2-1}{n^2+2}$$ (2.72)



16 In principle, the detailed fields of nearby dipoles should also be considered rather than
representing their influence with the macroscopic field. However, if they are symmetrically distributed
the result is the same. See J. D. Jackson, Classical Electrodynamics, 3rd ed., Sect. 4.5 (New York: John
Wiley, 1999).

17 This form of Clausius-Mossotti relation, in terms of the refractive index, was renamed the
Lorentz-Lorenz formula, but probably undeservedly so, since it is essentially the same formula.

Example 2.5

Xenon vapor at STP (density $4.46\times10^{-5}$ mol/cm$^3$) has index $n = 1.000702$ measured
at wavelength 589 nm. Use (a) the Clausius-Mossotti relation (2.70) and (b) the
uncorrected formula (i.e. numerator only) to predict the index for liquid xenon
with density $2.00\times10^{-2}$ mol/cm$^3$. Compare with the measured value of $n = 1.332$.$^{18}$

Solution: At the low density, we may safely neglect the correction in the
denominator of (2.71) and simply write $N_{\rm atm}\alpha/\epsilon_0 = 1.000702^2 - 1 = 1.404\times10^{-3}$.
The liquid density $N_{\rm liquid}$ is $2.00\times10^{-2}/4.46\times10^{-5} = 449$ times greater. Therefore,
$N_{\rm liquid}\alpha/\epsilon_0 = 449\times1.404\times10^{-3} = 0.630$. (a) According to Clausius-Mossotti (2.71),
the index is

$$n = \sqrt{1+\frac{0.630}{1-0.630/3}} = 1.341$$

(b) On the other hand, without the correction in the denominator, we get

$$n = \sqrt{1+0.630} = 1.277$$

The Clausius-Mossotti formula gets much closer to the measured value.
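
The arithmetic of this example can be reproduced in a few lines. The sketch below follows the
same steps as the solution (including the neglect of the denominator correction for the vapor);
the function names are hypothetical and the density ratio is carried at full precision rather
than rounded.

```python
import math

def index_clausius_mossotti(x):
    """Index from x = N*alpha/eps0 using (2.71)."""
    return math.sqrt(1.0 + x / (1.0 - x / 3.0))

def index_uncorrected(x):
    """Index from x = N*alpha/eps0 keeping only the numerator of (2.71)."""
    return math.sqrt(1.0 + x)

def x_from_index(n):
    """N*alpha/eps0 from the index, i.e. the inverted relation (2.72)."""
    return 3.0 * (n**2 - 1.0) / (n**2 + 2.0)

n_vapor = 1.000702
x_vapor = n_vapor**2 - 1.0            # low density: denominator correction neglected
density_ratio = 2.00e-2 / 4.46e-5     # liquid density over vapor density
x_liquid = x_vapor * density_ratio

n_corrected = index_clausius_mossotti(x_liquid)
n_plain = index_uncorrected(x_liquid)
print(f"N*alpha/eps0 (liquid) = {x_liquid:.3f}")
print(f"(a) Clausius-Mossotti: n = {n_corrected:.3f}")
print(f"(b) uncorrected:       n = {n_plain:.3f}")
```

The function x_from_index is included to show that (2.72) undoes (2.71), which is a convenient
consistency check on the algebra.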



Average Field Produced by a Dipole 

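Consider a dipole comprised of point charges $\pm q_e$ separated by spacing $\mathbf{r}_{\rm micro} = \hat{\mathbf{z}}d$.
If the dipole is centered on the origin, then by Coulomb's law the field surrounding
the point charges is

$$\mathbf{E} = \frac{q_e}{4\pi\epsilon_0}\frac{\mathbf{r}-\hat{\mathbf{z}}d/2}{\left|\mathbf{r}-\hat{\mathbf{z}}d/2\right|^3} - \frac{q_e}{4\pi\epsilon_0}\frac{\mathbf{r}+\hat{\mathbf{z}}d/2}{\left|\mathbf{r}+\hat{\mathbf{z}}d/2\right|^3}$$

We wish to compute the average field within a cubic volume $V = L^3$ that
symmetrically encompasses the dipole.$^{19}$ We take the volume dimension $L$ to be large
compared to the dipole dimension $d$. Integrating the field over this volume yields

$$\begin{aligned}
\int\mathbf{E}\,dv &= \frac{q_e}{4\pi\epsilon_0}\int_{-L/2}^{L/2}dx\int_{-L/2}^{L/2}dy\int_{-L/2}^{L/2}dz\left[\frac{x\hat{\mathbf{x}}+y\hat{\mathbf{y}}+(z-d/2)\hat{\mathbf{z}}}{\left[x^2+y^2+(z-d/2)^2\right]^{3/2}} - \frac{x\hat{\mathbf{x}}+y\hat{\mathbf{y}}+(z+d/2)\hat{\mathbf{z}}}{\left[x^2+y^2+(z+d/2)^2\right]^{3/2}}\right] \\
&= -\hat{\mathbf{z}}\frac{q_e}{2\pi\epsilon_0}\int_{-L/2}^{L/2}dx\int_{-L/2}^{L/2}dy\left[\frac{1}{\sqrt{x^2+y^2+(L-d)^2/4}} - \frac{1}{\sqrt{x^2+y^2+(L+d)^2/4}}\right]
\end{aligned}$$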



18 D. H. Garside, H. V. Molgaard, and B. L. Smith, "Refractive Index and Lorentz-Lorenz function
of Xenon Liquid and Vapour," J. Phys. B: At. Mol. Phys. 1, 449-457 (1968).

19 Authors often obtain the same result using a spherical volume with the (usually unmentioned) 
conceptual awkwardness that spheres cannot be closely packed to form a macroscopic medium 
without introducing voids. 




Figure 2.11 The field lines surrounding a dipole.

The terms multiplying $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$ vanish since they involve odd functions integrated
over even limits on either $x$ or $y$, respectively. On the remaining term, the
integration on $z$ has been executed. Before integrating the remaining expression over $x$
and $y$, we make the following approximation based on $L \gg d$:

$$\frac{1}{\sqrt{x^2+y^2+(L\pm d)^2/4}} \cong \frac{1}{\sqrt{x^2+y^2+L^2/4}\,\sqrt{1\pm\frac{Ld/2}{x^2+y^2+L^2/4}}} \cong \frac{1}{\sqrt{x^2+y^2+L^2/4}}\left[1\mp\frac{Ld/4}{x^2+y^2+L^2/4}\right]$$

which will make integration considerably easier.$^{20}$ Then integration over the $y$
dimension brings us to$^{21}$

$$\int\mathbf{E}\,dv = -\hat{\mathbf{z}}\frac{q_e d}{4\pi\epsilon_0}\int_{-L/2}^{L/2}dx\int_{-L/2}^{L/2}\frac{L\,dy}{\left[x^2+y^2+L^2/4\right]^{3/2}} = -\hat{\mathbf{z}}\frac{q_e d}{4\pi\epsilon_0}\int_{-L/2}^{L/2}\frac{L^2\,dx}{\left(x^2+L^2/4\right)\sqrt{x^2+L^2/2}}$$

The final integral is the same as twice the integral from 0 to $L/2$. Then, with $x > 0$,
we can employ the variable change $s = x^2 + L^2/4 \Rightarrow 2dx = ds/\sqrt{s-L^2/4}$ and obtain

$$\int\mathbf{E}\,dv = -\hat{\mathbf{z}}\frac{q_e d}{4\pi\epsilon_0}\int_{L^2/4}^{L^2/2}\frac{L^2\,ds}{s\sqrt{s^2-L^4/16}} = -\hat{\mathbf{z}}\frac{q_e d}{4\pi\epsilon_0}\,\frac{4\pi}{3}$$

Reinstalling $\mathbf{r}_{\rm micro} = \hat{\mathbf{z}}d$ and dividing by the volume $1/N$, allotted to individual
dipoles, brings us to the anticipated result (2.67).
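
The value $4\pi/3$ of the remaining one-dimensional integral can be confirmed numerically. The
sketch below applies a composite Simpson rule to the integrand
$L^2/[(x^2+L^2/4)\sqrt{x^2+L^2/2}]$ over $-L/2 \le x \le L/2$; the function names and the choice
of $L$ are assumptions made for the check, and the result should not depend on $L$.

```python
import math

def integrand(x, L=1.0):
    """Integrand left after the z and y integrations."""
    return L**2 / ((x**2 + L**2 / 4.0) * math.sqrt(x**2 + L**2 / 2.0))

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

value_unit_cube = simpson(integrand, -0.5, 0.5)
value_larger_cube = simpson(lambda x: integrand(x, L=2.0), -1.0, 1.0)

print(value_unit_cube, value_larger_cube, 4.0 * math.pi / 3.0)
```

The independence from $L$ reflects the fact that the average field depends only on the dipole
moment per volume, not on the size of the cell used for the average.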



20 One might be tempted to begin this calculation with the well-known dipole field

$$\mathbf{E} = \frac{q_e}{4\pi\epsilon_0 r^3}\left[\frac{\mathbf{r}-\hat{\mathbf{z}}d/2}{\left[1-\hat{\mathbf{z}}\cdot\hat{\mathbf{r}}\,d/r+d^2/4r^2\right]^{3/2}} - \frac{\mathbf{r}+\hat{\mathbf{z}}d/2}{\left[1+\hat{\mathbf{z}}\cdot\hat{\mathbf{r}}\,d/r+d^2/4r^2\right]^{3/2}}\right] \cong \frac{q_e d}{4\pi\epsilon_0 r^3}\left[3\hat{\mathbf{r}}\left(\hat{\mathbf{z}}\cdot\hat{\mathbf{r}}\right)-\hat{\mathbf{z}}\right]$$

which relies on the approximation

$$\left[1\pm\hat{\mathbf{z}}\cdot\hat{\mathbf{r}}\,d/r+d^2/4r^2\right]^{-3/2} \cong \left[1\pm\hat{\mathbf{z}}\cdot\hat{\mathbf{r}}\,d/r\right]^{-3/2} \cong 1\mp\frac{3d\,\hat{\mathbf{z}}\cdot\hat{\mathbf{r}}}{2r}$$

This dipole-field expression, while useful for describing the field surrounding the dipole, contains
no information about the fields internal to the dipole. Note that we integrate $z$ through the origin,
which would violate the above assumption $r \gg d$. Alternatively, the influence of the internal fields
on our integral could be accounted for using a delta function as is done in J. D. Jackson, Classical
Electrodynamics, 3rd ed., p. 149 (New York: John Wiley, 1999).

21 Two useful integral formulas are (0.61) and (0.61).

22 J. R. Reitz, F. J. Milford, and R. W. Christy, Foundations of Electromagnetic Theory 3rd ed., Sect.
6-3 (Reading, Massachusetts: Addison-Wesley, 1979).

Appendix 2.C Energy Density of Electric Fields

In this appendix we show that the term $\epsilon_0 E^2/2$ in (2.53) corresponds to the energy
density of an electric field.$^{22}$ The electric potential $\phi(\mathbf{r})$ (in units of energy per
charge, or volts) describes the potential energy that a charge would experience
if placed at any given point in the field. The electric field and the potential are
connected through

$$\mathbf{E}(\mathbf{r}) = -\nabla\phi(\mathbf{r})$$ (2.73)

The energy $U$ necessary to assemble a distribution of charges (owing to attraction
or repulsion) can be written in terms of a summation over all of the charges (or
charge density $\rho(\mathbf{r})$) located within the potential:

$$U = \frac{1}{2}\int_V \phi(\mathbf{r})\rho(\mathbf{r})\,dv$$ (2.74)

We consider the potential to arise from the charges themselves. The factor 1/2
is necessary to avoid double counting. To appreciate this factor consider just
two point charges: We only need to count the energy due to one charge in the
presence of the other's potential to obtain the energy required to bring the charges
together.

A substitution of (1.1) for $\rho(\mathbf{r})$ into (2.74) gives

$$U = \frac{\epsilon_0}{2}\int_V \phi(\mathbf{r})\nabla\cdot\mathbf{E}(\mathbf{r})\,dv$$ (2.75)

Next, we use the vector identity in P0.9 and get

$$U = \frac{\epsilon_0}{2}\int_V \nabla\cdot\left[\phi(\mathbf{r})\mathbf{E}(\mathbf{r})\right]dv - \frac{\epsilon_0}{2}\int_V \mathbf{E}(\mathbf{r})\cdot\nabla\phi(\mathbf{r})\,dv$$ (2.76)

An application of the divergence theorem (0.11) on the first integral and a
substitution of (2.73) into the second integral yields

$$U = \frac{\epsilon_0}{2}\oint_S \phi(\mathbf{r})\mathbf{E}(\mathbf{r})\cdot\hat{\mathbf{n}}\,da + \frac{\epsilon_0}{2}\int_V \mathbf{E}(\mathbf{r})\cdot\mathbf{E}(\mathbf{r})\,dv$$ (2.77)

We can consider the volume $V$ (enclosed by $S$) to be as large as we like, say
a sphere of radius $R$, so that all charges are contained well within it. Then the
surface integral over $S$ vanishes as $R \rightarrow \infty$ since $\phi \sim 1/R$ and $E \sim 1/R^2$, whereas
$da \sim R^2$. Then the total energy is expressed solely in terms of the electric field:

$$U = \int_{\rm All\ Space} u_E(\mathbf{r})\,dv$$ (2.78)

where

$$u_E(\mathbf{r}) = \frac{\epsilon_0 E^2}{2}$$ (2.79)



is interpreted as the energy density of the electric field.

Appendix 2.D Energy Density of Magnetic Fields

In a derivation similar to that in appendix 2.C, we consider the energy associated
with magnetic fields.$^{23}$ The magnetic vector potential $\mathbf{A}(\mathbf{r})$ (in units of energy
per charge $\times$ velocity) describes the potential energy that a charge moving with
velocity $\mathbf{v}$ would experience if placed in the field. The magnetic field and the
vector potential are connected through

$$\mathbf{B}(\mathbf{r}) = \nabla\times\mathbf{A}(\mathbf{r})$$ (2.80)

The energy $U$ necessary to assemble a distribution of currents can be written in
terms of a summation over all of the currents (or current density $\mathbf{J}(\mathbf{r})$) located
within the vector potential field:

$$U = \frac{1}{2}\int_V \mathbf{J}(\mathbf{r})\cdot\mathbf{A}(\mathbf{r})\,dv$$ (2.81)

As in (2.74), the factor 1/2 is necessary to avoid double counting the influence of
the currents on each other.

Under the assumption of steady currents (no variations in time), we may
substitute Ampere's law (1.21) into (2.81), which yields

$$U = \frac{1}{2\mu_0}\int_V \left[\nabla\times\mathbf{B}(\mathbf{r})\right]\cdot\mathbf{A}(\mathbf{r})\,dv$$ (2.82)

Next we employ the vector identity P0.8 from which the previous expression
becomes

$$U = \frac{1}{2\mu_0}\int_V \mathbf{B}(\mathbf{r})\cdot\left[\nabla\times\mathbf{A}(\mathbf{r})\right]dv - \frac{1}{2\mu_0}\int_V \nabla\cdot\left[\mathbf{A}(\mathbf{r})\times\mathbf{B}(\mathbf{r})\right]dv$$ (2.83)

Upon substituting (2.80) into the first integral and applying the divergence
theorem (0.11) on the second integral, this expression for total energy becomes

$$U = \frac{1}{2\mu_0}\int_V \mathbf{B}(\mathbf{r})\cdot\mathbf{B}(\mathbf{r})\,dv - \frac{1}{2\mu_0}\oint_S \left[\mathbf{A}(\mathbf{r})\times\mathbf{B}(\mathbf{r})\right]\cdot\hat{\mathbf{n}}\,da$$ (2.84)

As was done in connection with (2.77), if we choose a large enough volume (a
sphere with radius $R \rightarrow \infty$), the surface integral vanishes since $A \sim 1/R$ and
$B \sim 1/R^2$, whereas $da \sim R^2$. The total energy (2.84) then reduces to

$$U = \int_{\rm All\ Space} u_B(\mathbf{r})\,dv$$ (2.85)

where

$$u_B(\mathbf{r}) = \frac{B^2}{2\mu_0}$$ (2.86)

is the energy density for a magnetic field.
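
As a numerical cross-check of the electric-field result in Appendix 2.C, the sketch below computes
the assembly energy of a uniformly charged sphere in two ways: from the charge and potential as in
(2.74), and from the field energy density as in (2.78) and (2.79). The uniformly charged sphere,
its standard textbook potential and field, the sample charge and radius, and all function names
are assumptions introduced for this illustration; none of them appear in the text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def energy_from_charge(Q, a):
    """U = (1/2) * integral of phi * rho over the sphere, as in (2.74)."""
    rho = 3.0 * Q / (4.0 * math.pi * a**3)
    def phi(r):
        # Potential inside a uniformly charged sphere of radius a.
        return Q * (3.0 - r**2 / a**2) / (8.0 * math.pi * EPS0 * a)
    return 0.5 * simpson(lambda r: phi(r) * rho * 4.0 * math.pi * r**2, 0.0, a)

def energy_from_field(Q, a):
    """U = integral of eps0 E^2 / 2 over all space, as in (2.78)-(2.79)."""
    def shell_inside(r):
        E = Q * r / (4.0 * math.pi * EPS0 * a**3)
        return 0.5 * EPS0 * E**2 * 4.0 * math.pi * r**2
    inside = simpson(shell_inside, 0.0, a)
    # Outside, E = Q / (4 pi eps0 r^2). With s = 1/r the shell integrand
    # (eps0/2) E^2 4 pi r^2 dr becomes the constant Q^2 / (8 pi eps0) ds,
    # so the infinite range r in (a, inf) maps onto s in (0, 1/a).
    outside = simpson(lambda s: Q**2 / (8.0 * math.pi * EPS0), 0.0, 1.0 / a)
    return inside + outside

Q, a = 1.0e-9, 0.01  # hypothetical charge (C) and radius (m)
U_charge = energy_from_charge(Q, a)
U_field = energy_from_field(Q, a)
U_exact = 3.0 * Q**2 / (20.0 * math.pi * EPS0 * a)  # closed form for this geometry
print(U_charge, U_field, U_exact)
```

Most of the energy (five sixths of it for this geometry) resides in the field outside the sphere,
where there is no charge at all, which illustrates why the field picture (2.78) is a genuinely
different bookkeeping of the same total.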

23 J. R. Reitz, F. J. Milford, and R. W. Christy, Foundations of Electromagnetic Theory 3rd ed., Sect.
12-2 (Reading, Massachusetts: Addison-Wesley, 1979).

Exercises



Exercises for 2.3 The Lorentz Model of Dielectrics 



P2.1 Verify that (2.35) is a solution to (2.34).

P2.2 Derive the Sellmeier equation

$$n^2 = 1 + \frac{A\lambda_{\rm vac}^2}{\lambda_{\rm vac}^2-\lambda_{0,\rm vac}^2}$$

from (2.39) for a gas with negligible absorption (i.e. $\gamma = 0$, valid far
from resonance $\omega_0$), where $\lambda_{0,\rm vac}$ corresponds to frequency $\omega_0$ and $A$ is
a constant. Many materials (e.g. glass, air) have strong resonances in
the ultraviolet. In such materials, do you expect the index of refraction
for blue light to be greater than that for red light? Make a sketch of $n$ as
a function of wavelength for visible light down to the ultraviolet (where
$\lambda_{0,\rm vac}$ is located).

P2.3 In the Lorentz model, take $N = 10^{28}$ m$^{-3}$ for the density of bound
electrons in an insulator (note that $N$ is number per volume, not just
number), and a single transition at $\omega_0 = 6\times10^{15}$ rad/sec (in the UV),
and damping $\gamma = \omega_0/5$ (quite broad). Assume $E_0$ is $10^4$ V/m.

For three frequencies $\omega = \omega_0 - 2\gamma$, $\omega = \omega_0$, and $\omega = \omega_0 + 2\gamma$ find the
magnitude and phase (relative to the phase of $\mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}$) of the following
quantities. Give correct SI units with each quantity. You don't need to
worry about vector directions.

(a) The charge displacement amplitude $\mathbf{r}_{\rm micro}$ (2.35)

(b) The polarization $\mathbf{P}(\omega)$

(c) The susceptibility $\chi(\omega)$. What would the susceptibility be for twice
the E-field strength as before?

For the following no phase is needed:

(d) Find $n$ and $\kappa$ at the three frequencies. You will have to solve for the
real and imaginary parts of $(n + i\kappa)^2 = 1 + \chi(\omega)$.

(e) Find the three speeds of light in terms of $c$. Find the three
wavelengths $\lambda$.

(f) Find how far light penetrates into the material before only $1/e$ of the
amplitude of $E$ remains. Find how far light penetrates into the material
before only $1/e$ of the intensity $I$ remains.



P2.4 (a) Use a computer graphing program and the Lorentz model to plot $n$
and $\kappa$ as a function of frequency $\omega$ for a dielectric (i.e. obtain graphs
such as the ones in Fig. 2.5). Use these parameters to keep things
simple: $\omega_p = 1$, $\omega_0 = 10$, and $\gamma = 1$; plot your function from $\omega = 0$ to
$\omega = 20$.

(b) Plot $n$ and $\kappa$ as a function of frequency for a material that has
three resonant frequencies: $\omega_{01} = 10$, $\gamma_1 = 1$, $f_1 = 0.5$; $\omega_{02} = 15$, $\gamma_2 = 1$,
$f_2 = 0.25$; and $\omega_{03} = 25$, $\gamma_3 = 3$, $f_3 = 0.25$. Use $\omega_p = 1$ for all three
resonances, and plot the results from $\omega = 0$ to $\omega = 30$. Comment on
your plots.

Exercises for 2.4 Index of Refraction of a Conductor 

P2.5 For silver, the complex refractive index is characterized by $n = 0.13$
and $\kappa = 4.0$.$^{24}$ Find the distance that light travels inside of silver before
the field is reduced by a factor of $1/e$. Assume a wavelength of $\lambda_{\rm vac} =
633$ nm. What is the speed of the wave crests in the silver (written as a
number times $c$)? Are you surprised?

P2.6 Use (2.27), (2.29), and (2.48) to estimate the index of silver at $\lambda =
633$ nm. The density of free electrons in silver is $N = 5.86\times10^{28}$ m$^{-3}$ and
the DC conductivity is $\sigma = 6.62\times10^7$ C$^2$/(J$\cdot$m$\cdot$s).$^{25}$ Compare with the
actual index given in P2.5.

Answer: $n + i\kappa = 0.02 + i4.50$

P2.7 The dielectric model and the conductor model give identical results
for $n$ in the case of a low-density plasma where there is no restoring
force (i.e. $\omega_0 = 0$) and no dragging term (i.e. $\gamma = 0$). Use this to model
the ionosphere (the uppermost part of the atmosphere that is ionized
by solar radiation to form a low-density plasma).

(a) If the index of refraction of the ionosphere is $n = 0.9$ for an FM
station at $\nu = \omega/2\pi = 100$ MHz, calculate the number of free electrons
per cubic meter.

(b) What is the complex refractive index of the ionosphere for radio 
waves at 1160 kHz (KSL radio station)? Is this frequency above or below 
the plasma frequency? Assume the same density of free electrons as in 
part (a) . 

For your information, AM radio reflects better than FM radio from the 
ionosphere (like visible light from a metal mirror) . At night, the lower 
layer of the ionosphere goes away so that AM radio waves reflect from 
a higher layer. 

P2.8 Use a computer to plot $n$ and $\kappa$ as a function of frequency for a
conductor (obtain plots such as the ones in Fig. 2.6). To keep things
simple, let $\gamma = 0.02\omega_p$ and plot your function from
$\omega = 0.6\omega_p$ to $\omega = 2\omega_p$.

24 Handbook of Optical Constants of Solids, Edited by E. D. Palik (Elsevier, 1997). 
25 G. Burns, Solid State Physics, p. 194 (Orlando: Academic Press, 1985).

Exercises for 2.6 Irradiance of a Plane Wave

P2.9 In the case of a linearly-polarized plane wave, where the phase of each
vector component of $\mathbf{E}_0$ is the same, re-derive (2.61) directly from the
real field (2.21). For simplicity, you may ignore absorption (i.e. $\kappa = 0$).

HINT: The time-average of $\cos^2(\mathbf{k}\cdot\mathbf{r} - \omega t + \phi)$ is 1/2.

P2.10 (a) Find the intensity (in W/cm$^2$) produced by a short laser pulse
(linearly polarized) with duration $\Delta t = 2.5\times10^{-14}$ s and energy $E = 100$ mJ,
focused in vacuum to a round spot with radius $r = 5$ $\mu$m.

(b) What is the peak electric field (in V/Å)?
HINT: The SI units of electric field are N/C = V/m.

(c) What is the peak magnetic field (in T = kg/(s$\cdot$C))?

P2.11 (a) What is the intensity (in W/cm$^2$) on the retina when looking directly 
at the sun? Assume that the eye's pupil has a radius $r_\mathrm{pupil} = 1$ mm. 
Take the Sun's irradiance at the earth's surface to be 1.4 kW/m$^2$, and 
neglect refractive index (i.e. set n = 1). HINT: The Earth-Sun distance 
is $d = 1.5 \times 10^8$ km and the pupil-retina distance is $d_i = 22$ mm. The 
radius of the Sun $r_\mathrm{Sun} = 7.0 \times 10^5$ km is de-magnified on the retina 
according to the ratio $d_i/d$. 

(b) What is the intensity at the retina when looking directly into a 
1 mW HeNe laser? Assume that the smallest radius of the laser beam 
is $r_\mathrm{waist} = 0.5$ mm positioned d = 2 m in front of the eye, and that the 
entire beam enters the pupil. Compare with part (a). 

P2.12 Show that the magnetic field of an intense laser with $\lambda = 1$ μm becomes 
important for a free electron oscillating in the field at intensities above 
$10^{18}$ W/cm$^2$. This marks the transition to relativistic physics. Neverthe- 
less, for convenience, use classical physics in making the estimate. 

HINT: At lower intensities, the oscillating electric field dominates, so 
the electron motion can be thought of as arising solely from the electric 
field. Use this motion to calculate the magnetic force on the mov- 
ing electron, and compare it to the electric force. The forces become 
comparable at $10^{18}$ W/cm$^2$. 

Exercises for 2.A Radiometry, Photometry, and Color 

P2.13 The CIE 1931 RGB color matching functions $\bar{r}(\lambda)$, $\bar{g}(\lambda)$, and $\bar{b}(\lambda)$ can be 
transformed using (2.64) to obtain color matching functions for the 
XYZ basis: $\bar{x}(\lambda)$, $\bar{y}(\lambda)$, and $\bar{z}(\lambda)$, plotted in Fig. 2.12. As with the RGB 
color matching functions, the XYZ color matching functions can be 
used to calculate the color coordinates in the XYZ basis for an arbitrary 
spectrum: 

$$X = \int I(\lambda)\,\bar{x}\,d\lambda \qquad Y = \int I(\lambda)\,\bar{y}\,d\lambda \qquad Z = \int I(\lambda)\,\bar{z}\,d\lambda \tag{2.87}$$

(The function $\bar{y}(\lambda)$ was chosen to be exactly the photopic response curve 
(shown in Fig. 2.8), so that Y describes the photometric brightness of 
the light.) 

Figure 2.12 Color matching functions for the CIE XYZ color space. 

Obtain a copy of the XYZ color matching functions from www.cvrl.org 
and calculate the XYZ color coordinates for the spectrum 

$$I(\lambda) = I_0\, e^{-(\lambda - 500\ \mathrm{nm})^2}$$

P2.14 The color space you've probably encountered most is sRGB, used to rep- 
resent color on computer displays. The sRGB coordinates are related 
to the XYZ coordinates by the transformation 



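$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 3.2406 & -1.5372 & -0.4986 \\ -0.9689 & 1.8758 & 0.0415 \\ 0.0557 & -0.2040 & 1.0570 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}$$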



where the XYZ coordinates need to be scaled to values similar to those 
accepted by the sRGB device (commonly to 255) and then the sRGB 
coordinates R, G, and B need to be scaled or clipped to fit in the appro- 
priate range. (This scaling and clipping result from the fact that your 
monitor cannot display arbitrarily bright light.) 

Obtain a copy of the XYZ color matching functions from www.cvrl.org 
and use it to calculate the sRGB components for monochromatic light 
from $\lambda_0 = 400$ nm to $\lambda_0 = 700$ nm in 1 nm intervals. Make a plot of 
the individual sRGB values and also use the coordinates to display a 
rainbow. HINT: Matlab has all the functions you need to display the 
rainbow. 



Chapter 3 

Reflection and Refraction 



As we know from everyday experience, when light arrives at an interface between 
materials it is partially reflected and partially transmitted. In this chapter, we 
examine what happens to plane waves when they propagate from one material 
(characterized by index n or even by complex index $\mathcal{N}$) to another material. We 
will derive expressions to quantify the amount of reflection and transmission. The 
results depend on the angle of incidence (i.e. the angle between k and the normal 
to the surface) as well as on the orientation of the electric field (called polarization 
- not to be confused with P, also called polarization). In this chapter, we consider 
only isotropic materials (e.g. glass); in chapter 5 we consider anisotropic materials 
(e.g. a crystal). 

As we develop the connection between incident, reflected, and transmitted 
lightwaves, 1 several familiar relationships will emerge naturally (e.g. Snell's law 
and Brewster's angle). The formalism also describes polarization-dependent 
phase shifts upon reflection (especially interesting in the case of total internal 
reflection or in the case of reflections from absorbing surfaces such as metals) . 

For simplicity, we initially neglect the imaginary part of the refractive index. 
Each plane wave is thus characterized by a real wave vector k. We will write each 
plane wave in the form $\mathbf{E}(\mathbf{r},t) = \mathbf{E}_0 \exp[i(\mathbf{k}\cdot\mathbf{r} - \omega t)]$, where, as usual, only the real 
part of the field corresponds to the physical field. The restriction to real refractive 
indices is not as serious as it might seem. The use of the letter n instead of $\mathcal{N}$ 
hardly matters. The math is all the same, which demonstrates the power of the 
complex notation. We can simply update our expressions in the end to include 
complex refractive indices, but in the meantime it is easier to think of absorption 
as negligible. 

3.1 Refraction at an Interface 

Consider a planar boundary between two materials with different indices. The 
index $n_i$ characterizes the material on the left, and the index $n_t$ characterizes the 

1 See M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 1.5 (Cambridge University Press, 
1999). 
Figure 3.1 Incident, reflected, and 
transmitted plane wave fields at a 
material interface. 



material on the right, as depicted in Fig. 3.1. When a plane wave traveling in 
the direction $\mathbf{k}_i$ is incident on the boundary from the left, it gives rise to a reflected 
plane wave traveling in the direction $\mathbf{k}_r$ and a transmitted plane wave traveling in the 
direction $\mathbf{k}_t$. The incident and reflected waves exist only to the left of the material 
interface, and the transmitted wave exists only to the right of the interface. The 
angles $\theta_i$, $\theta_r$, and $\theta_t$ give the angles that each respective wave vector ($\mathbf{k}_i$, $\mathbf{k}_r$, and 
$\mathbf{k}_t$) makes with the normal to the interface. 

For simplicity, we'll assume that both of the materials are isotropic here. 
(Chapter 5 discusses refraction for anisotropic materials.) In this case, $\mathbf{k}_i$, $\mathbf{k}_r$, and 
$\mathbf{k}_t$ all lie in a single plane, referred to as the plane of incidence (i.e. the plane 
represented by the surface of this page). We are free to orient our coordinate 
system in many different ways (and every textbook seems to do it differently!).$^2$ 
We choose the y-z plane to be the plane of incidence, with the z-direction normal 
to the interface and the x-axis pointing into the page. 

The electric field vector for each plane wave is confined to a plane perpendic- 
ular to its wave vector. We are free to decompose the field vector into arbitrary 
components as long as they are perpendicular to the wave vector. It is customary 
to choose one of the electric field vector components to be that which lies within 
the plane of incidence. We call this p-polarized light, where p stands for parallel to 
the plane of incidence. The remaining electric field vector component is directed 
normal to the plane of incidence and is called s-polarized light. The s stands for 
senkrecht, a German word meaning perpendicular. 

Using this system, we can decompose the electric field vector $\mathbf{E}_i$ into its p- 
polarized component $E_i^{(p)}$ and its s-polarized component $E_i^{(s)}$, as depicted in 
Fig. 3.1. The s component $E_i^{(s)}$ is represented by the tail of an arrow pointing 
into the page, or the x-direction in our convention. The other fields $\mathbf{E}_r$ and $\mathbf{E}_t$ 
are similarly split into s and p components as indicated in Fig. 3.1. All field 
components are considered to be positive when they point in the direction of 
their respective arrows.$^3$ Note that the s-polarized components are parallel for 
all three plane waves, whereas the p-polarized components are not (except at 
normal incidence) because each plane wave travels in a different direction. 

By inspection of Fig. 3.1, we can write the various wave vectors in terms of the 
y and z unit vectors: 

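$$\begin{aligned}
\mathbf{k}_i &= k_i\left(\hat{\mathbf{y}}\sin\theta_i + \hat{\mathbf{z}}\cos\theta_i\right) \\
\mathbf{k}_r &= k_r\left(\hat{\mathbf{y}}\sin\theta_r - \hat{\mathbf{z}}\cos\theta_r\right) \\
\mathbf{k}_t &= k_t\left(\hat{\mathbf{y}}\sin\theta_t + \hat{\mathbf{z}}\cos\theta_t\right)
\end{aligned} \tag{3.1}$$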

Also by inspection of Fig. 3.1 (following the conventions for the electric fields 
indicated by the arrows), we can write the incident, reflected, and transmitted 



2 For example, our convention is different than that used by E. Hecht, Optics, 3rd ed., Sect. 4.6.2 
(Massachusetts: Addison-Wesley, 1998). 

3 Many textbooks draw the arrow for $E_r^{(p)}$ in the direction opposite of ours. However, that choice 
leads to an awkward situation at normal incidence (i.e. $\theta_i = \theta_t = 0$) where the arrows for the incident 
and reflected fields are parallel for the s-component but anti-parallel for the p-component. 
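fields in terms of x, y, and z: 

$$\begin{aligned}
\mathbf{E}_i &= \left[E_i^{(p)}\left(\hat{\mathbf{y}}\cos\theta_i - \hat{\mathbf{z}}\sin\theta_i\right) + \hat{\mathbf{x}}E_i^{(s)}\right] e^{i\left[k_i\left(y\sin\theta_i + z\cos\theta_i\right) - \omega_i t\right]} \\
\mathbf{E}_r &= \left[E_r^{(p)}\left(\hat{\mathbf{y}}\cos\theta_r + \hat{\mathbf{z}}\sin\theta_r\right) + \hat{\mathbf{x}}E_r^{(s)}\right] e^{i\left[k_r\left(y\sin\theta_r - z\cos\theta_r\right) - \omega_r t\right]} \\
\mathbf{E}_t &= \left[E_t^{(p)}\left(\hat{\mathbf{y}}\cos\theta_t - \hat{\mathbf{z}}\sin\theta_t\right) + \hat{\mathbf{x}}E_t^{(s)}\right] e^{i\left[k_t\left(y\sin\theta_t + z\cos\theta_t\right) - \omega_t t\right]}
\end{aligned} \tag{3.2}$$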



Each field has the form (2.8), and we have utilized the k-vectors (3.1) in the 
exponents of (3.2). 

Now we are ready to connect the fields on one side of the interface to the 
fields on the other side. This is done using boundary conditions. As explained 
in appendix 3.A, Maxwell's equations require that the components of E that are 
parallel to the interface must be the same on either side of the boundary. In 
our coordinate system, the x and y components are parallel to the interface, and 
z = 0 defines the interface. This means that at z = 0 the x and y components 
of the combined incident and reflected fields must equal the corresponding 
components of the transmitted field: 



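$$\left[E_i^{(p)}\hat{\mathbf{y}}\cos\theta_i + \hat{\mathbf{x}}E_i^{(s)}\right] e^{i\left(k_i y\sin\theta_i - \omega_i t\right)} + \left[E_r^{(p)}\hat{\mathbf{y}}\cos\theta_r + \hat{\mathbf{x}}E_r^{(s)}\right] e^{i\left(k_r y\sin\theta_r - \omega_r t\right)} = \left[E_t^{(p)}\hat{\mathbf{y}}\cos\theta_t + \hat{\mathbf{x}}E_t^{(s)}\right] e^{i\left(k_t y\sin\theta_t - \omega_t t\right)} \tag{3.3}$$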



Since this equation must hold for all conceivable values of t and y, we are com- 
pelled to set all the phase factors in the complex exponentials equal to each other. 
The time portion of the phase factors requires the frequency of all waves to be the 
same: 

$$\omega_i = \omega_r = \omega_t \equiv \omega \tag{3.4}$$

(We could have guessed that all frequencies would be the same; otherwise wave 
fronts would be annihilated or created at the interface.) Similarly, equating the 
spatial terms in the exponents of (3.3) requires 



$$k_i\sin\theta_i = k_r\sin\theta_r = k_t\sin\theta_t \tag{3.5}$$



Now recall from (2.19) the relations $k_i = k_r = n_i\omega/c$ and $k_t = n_t\omega/c$. With these 
relations, (3.5) yields the law of reflection 

$$\theta_r = \theta_i \tag{3.6}$$

and Snell's law 

$$n_i\sin\theta_i = n_t\sin\theta_t \tag{3.7}$$

The three angles $\theta_i$, $\theta_r$, and $\theta_t$ are not independent. The reflected angle matches 
the incident angle, and the transmitted angle obeys Snell's law. The phenomenon 
of refraction refers to the fact that $\theta_i$ and $\theta_t$ are different. 
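
The two angle relations can be evaluated numerically. The following is a minimal Python sketch, assuming an air-glass interface with indices 1 and 1.5 and a 30° incident angle; the function name `refraction_angle` and the sample values are illustrative choices, not part of the text.

```python
import math

def refraction_angle(n_i: float, n_t: float, theta_i: float) -> float:
    """Transmitted angle (radians) from Snell's law (3.7): n_i sin(theta_i) = n_t sin(theta_t)."""
    s = n_i * math.sin(theta_i) / n_t
    if abs(s) > 1.0:
        # No real transmitted angle exists for this geometry.
        raise ValueError("n_i sin(theta_i) / n_t exceeds one; no real transmitted angle")
    return math.asin(s)

n_i, n_t = 1.0, 1.5                 # assumed air-to-glass indices
theta_i = math.radians(30.0)        # assumed incident angle
theta_r = theta_i                   # law of reflection (3.6)
theta_t = refraction_angle(n_i, n_t, theta_i)

print(f"theta_r = {math.degrees(theta_r):.2f} deg, theta_t = {math.degrees(theta_t):.2f} deg")
```

Because the index on the transmitted side is larger here, the transmitted ray bends toward the normal.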

Because the exponents are all identical, (3.3) reduces to two relatively simple 
equations (one for each dimension, x and y): 




Figure 3.2 Animation of s- and 
p-polarized fields incident on an 
interface as the angle of incidence 
is varied. 




Willebrord Snell (or Snellius) (1580- 
1626, Dutch) was an astronomer and 
mathematician born in Leiden, Nether- 
lands. In 1613 he succeeded his father 
as professor of mathematics at the 
University of Leiden. He was an accom- 
plished mathematician, developing a 
new method for calculating π as well as 
an improved method for measuring the 
circumference of the earth. He is most 
famous for his rediscovery of the law of 
refraction in 1621. (The law was known 
(in table form) to the ancient Greek 
mathematician Ptolemy, to Persian en- 
gineer Ibn Sahl (900s), and to Polish 
philosopher Witelo (1200s).) Snell au- 
thored several books, including one on 
trigonometry, published a year after his 
death. (Wikipedia) 



$$E_i^{(s)} + E_r^{(s)} = E_t^{(s)} \tag{3.8}$$

and 

$$\left(E_i^{(p)} + E_r^{(p)}\right)\cos\theta_i = E_t^{(p)}\cos\theta_t \tag{3.9}$$



We have derived these equations from the boundary condition (3.52) on the 
parallel component of the electric field. This set of equations has four unknowns 
($E_r^{(p)}$, $E_r^{(s)}$, $E_t^{(p)}$, and $E_t^{(s)}$), assuming that we pick the incident fields. We require 
two further equations to solve the system. These are obtained using the separate 
boundary condition on the parallel component of magnetic fields given in (3.56) 
(also discussed in appendix 3.A). 

From Faraday's law (1.3), we have for a plane wave (see (2.56)) 



$$\mathbf{B} = \frac{\mathbf{k}\times\mathbf{E}}{\omega} = \frac{n}{c}\,\hat{\mathbf{u}}\times\mathbf{E} \tag{3.10}$$

where $\hat{\mathbf{u}} \equiv \mathbf{k}/k$ is a unit vector in the direction of k. We have also utilized (2.19) 
for a real index. This expression is useful for writing $\mathbf{B}_i$, $\mathbf{B}_r$, and $\mathbf{B}_t$ in terms of the 
electric field components that we have already introduced. Upon inserting (3.1) 
and (3.2) into (3.10), the incident, reflected, and transmitted magnetic fields turn 
out to be 


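$$\begin{aligned}
\mathbf{B}_i &= \frac{n_i}{c}\left[-\hat{\mathbf{x}}E_i^{(p)} + E_i^{(s)}\left(-\hat{\mathbf{z}}\sin\theta_i + \hat{\mathbf{y}}\cos\theta_i\right)\right] e^{i\left[k_i\left(y\sin\theta_i + z\cos\theta_i\right) - \omega_i t\right]} \\
\mathbf{B}_r &= \frac{n_r}{c}\left[\hat{\mathbf{x}}E_r^{(p)} + E_r^{(s)}\left(-\hat{\mathbf{z}}\sin\theta_r - \hat{\mathbf{y}}\cos\theta_r\right)\right] e^{i\left[k_r\left(y\sin\theta_r - z\cos\theta_r\right) - \omega_r t\right]} \\
\mathbf{B}_t &= \frac{n_t}{c}\left[-\hat{\mathbf{x}}E_t^{(p)} + E_t^{(s)}\left(-\hat{\mathbf{z}}\sin\theta_t + \hat{\mathbf{y}}\cos\theta_t\right)\right] e^{i\left[k_t\left(y\sin\theta_t + z\cos\theta_t\right) - \omega_t t\right]}
\end{aligned} \tag{3.11}$$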
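
The vector algebra behind the incident-field line of (3.11) can be checked numerically. The Python sketch below, written under assumed values, builds the unit vector along the incident wave vector from (3.1) and the incident amplitude vector from (3.2), then compares their cross product with the bracketed vector in (3.11). The angle and the amplitudes are arbitrary illustrative numbers, and the helper `cross` is introduced only for this check.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as (x, y, z) tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

theta_i = math.radians(35.0)   # arbitrary incident angle (assumed)
E_p, E_s = 0.6, 0.8            # arbitrary p and s amplitudes (assumed)
s, c = math.sin(theta_i), math.cos(theta_i)

u_i = (0.0, s, c)                      # unit vector along k_i, from (3.1)
E_i = (E_s, E_p * c, -E_p * s)         # amplitude vector of E_i, from (3.2)

u_cross_E = cross(u_i, E_i)            # vector part of (3.10), without n/c
expected = (-E_p, E_s * c, -E_s * s)   # bracket in the first line of (3.11)

u_dot_E = sum(a * b for a, b in zip(u_i, E_i))   # transverse field: should vanish
print(u_cross_E, expected, u_dot_E)
```

The same check works for the reflected and transmitted lines by swapping in the corresponding direction and field vectors.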



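Next, we apply the boundary condition (3.56), namely that the components of B 
parallel to the interface (i.e. in the x and y dimensions) are the same$^4$ on either 
side of the plane z = 0. Since we already know that the exponents are all equal 
and that $\theta_r = \theta_i$ and $n_i = n_r$, the boundary condition gives 

$$\frac{n_i}{c}\left[-\hat{\mathbf{x}}E_i^{(p)} + E_i^{(s)}\hat{\mathbf{y}}\cos\theta_i\right] + \frac{n_i}{c}\left[\hat{\mathbf{x}}E_r^{(p)} - E_r^{(s)}\hat{\mathbf{y}}\cos\theta_i\right] = \frac{n_t}{c}\left[-\hat{\mathbf{x}}E_t^{(p)} + E_t^{(s)}\hat{\mathbf{y}}\cos\theta_t\right] \tag{3.12}$$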

As before, (3.12) reduces to two relatively simple equations (one for the x dimen- 
sion and one for the y dimension): 



$$n_i\left(E_i^{(p)} - E_r^{(p)}\right) = n_t E_t^{(p)} \tag{3.13}$$

and 

$$n_i\left(E_i^{(s)} - E_r^{(s)}\right)\cos\theta_i = n_t E_t^{(s)}\cos\theta_t \tag{3.14}$$



These two equations together with (3.8) and (3.9) allow us to solve for the reflected 
fields $\mathbf{E}_r$ and transmitted fields $\mathbf{E}_t$ for the s and p polarization components. However, 
(3.8), (3.9), (3.13), and (3.14) are not yet in their most convenient form. 
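
Since the four boundary equations are linear in the unknown field amplitudes, they can also be solved numerically before any algebraic simplification. The Python sketch below treats each polarization as a 2x2 linear system and solves it with Cramer's rule. The indices, the angle, the unit incident amplitudes, and the helper `solve_2x2` are all illustrative assumptions.

```python
import math

def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve [[a11, a12], [a21, a22]] (x1, x2) = (b1, b2) by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

n_i, n_t = 1.0, 1.5                                   # assumed air-to-glass indices
theta_i = math.radians(30.0)                          # assumed incident angle
theta_t = math.asin(n_i * math.sin(theta_i) / n_t)    # Snell's law (3.7)
ci, ct = math.cos(theta_i), math.cos(theta_t)
E_i_s = E_i_p = 1.0                                   # unit incident amplitudes

# s-polarization, unknowns (E_r_s, E_t_s):
#   (3.8):  -E_r + E_t                 = E_i
#   (3.14): n_i ci E_r + n_t ct E_t    = n_i ci E_i
E_r_s, E_t_s = solve_2x2(-1.0, 1.0, n_i * ci, n_t * ct, E_i_s, n_i * ci * E_i_s)

# p-polarization, unknowns (E_r_p, E_t_p):
#   (3.9):  -ci E_r + ct E_t           = ci E_i
#   (3.13): n_i E_r + n_t E_t          = n_i E_i
E_r_p, E_t_p = solve_2x2(-ci, ct, n_i, n_t, ci * E_i_p, n_i * E_i_p)

print(E_r_s, E_t_s, E_r_p, E_t_p)
```

With unit incident amplitudes, the four solved values are directly the field ratios for this one angle; a closed form valid at every angle requires the algebra that follows.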



4 We assume the permeability μ is the same everywhere (no magnetic effects). 
3.2 The Fresnel Coefficients 



Augustin Fresnel first derived the results of the previous section. Since he lived 
well before Maxwell's time, he did not have the benefit of Maxwell's equations as 
we have. Instead, Fresnel thought of light as transverse mechanical waves prop- 
agating within materials. (Fresnel was naturally a proponent of 'luminiferous 
ether'.) Instead of relating the parallel components of the electric and magnetic 
fields across the boundary between the materials, Fresnel used the principle that, 
as a transverse mechanical wave propagates from one material to the other, the 
two materials should not slip past each other at the interface. This "gluing" of the 
materials at the interface also forbids the possibility of the materials detaching 
from one another (creating gaps) or passing through one another as they expe- 
rience wave vibrations. This mechanical approach to light worked splendidly 
and explained polarization effects along with the variations in reflectance and 
transmittance as a function of the incident angle of the light. 

Fresnel wrote the relationships between the various plane waves depicted 
in Fig. 3.1 in terms of coefficients that compare the reflected and transmitted 
field amplitudes to those of the incident field. He then calculated the ratio of 
the reflected and transmitted field components to the incident field components 
for each polarization. In the following example, we illustrate this procedure for 
s-polarized light. It is left as a homework exercise to solve the equations for 
p-polarized light (see P3.1). 

Example 3.1 

Calculate the ratio of transmitted field to the incident field and the ratio of the 
reflected field to incident field for s-polarized light. 



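Solution: We use (3.8) 

$$E_i^{(s)} + E_r^{(s)} = E_t^{(s)}$$

and (3.14), which with the help of Snell's law is written 

$$E_i^{(s)} - E_r^{(s)} = \frac{\sin\theta_i\cos\theta_t}{\sin\theta_t\cos\theta_i}\,E_t^{(s)} \tag{3.15}$$

If we add these two equations, we get 

$$2E_i^{(s)} = \left[1 + \frac{\sin\theta_i\cos\theta_t}{\sin\theta_t\cos\theta_i}\right] E_t^{(s)} \tag{3.16}$$

and after dividing by $E_i^{(s)}$ and doing a little algebra, it turns into 

$$\frac{E_t^{(s)}}{E_i^{(s)}} = \frac{2\sin\theta_t\cos\theta_i}{\sin\theta_t\cos\theta_i + \sin\theta_i\cos\theta_t}$$

To get the ratio of reflected to incident, we subtract (3.15) from (3.8) to obtain 

$$2E_r^{(s)} = \left[1 - \frac{\sin\theta_i\cos\theta_t}{\sin\theta_t\cos\theta_i}\right] E_t^{(s)} \tag{3.17}$$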




Augustin Fresnel (1788-1829, French) 
was born in Broglie, France, the son of 
an architect. As a child, he was slow to 
develop and still could not read when 
he was eight years old, but by age sixteen 
he excelled and entered the École 
Polytechnique where he earned distinction. 
As a young man, Fresnel began a 
successful career as an engineer, but he 
lost his post in 1814 when Napoleon re- 
turned to power. (Fresnel had supported 
the Bourbons.) This difficult year was 
when Fresnel turned his attention to 
optics. Fresnel became a major propo- 
nent of the wave theory of light and 
four years later wrote a paper on diffrac- 
tion for which he was awarded a prize 
by the French Academy of Sciences. A 
year later he was appointed commis- 
sioner of lighthouses, which motivated 
the invention of the Fresnel lens (still 
used in many commercial applications). 
Fresnel was underappreciated before 
his untimely death from tuberculosis. 
Many of his papers did not make it into 
print until years later. Fresnel made 
huge advances in the understanding of 
reflection, diffraction, polarization, and 
birefringence. In 1824 Fresnel wrote 
to Thomas Young, "All the compli- 
ments that I have received from Arago, 
Laplace and Biot never gave me so 
much pleasure as the discovery of a 
theoretic truth, or the confirmation of 
a calculation by experiment." Augustin 
Fresnel is a hero of one of the authors 
of this textbook. (Wikipedia) 
Figure 3.3 The Fresnel coefficients 
plotted versus $\theta_i$ for the case of an 
air-glass interface with $n_i = 1$ and 
$n_t = 1.5$. 



and then divide (3.17) by (3.16). After a little algebra, we arrive at 

$$\frac{E_r^{(s)}}{E_i^{(s)}} = \frac{\sin\theta_t\cos\theta_i - \sin\theta_i\cos\theta_t}{\sin\theta_t\cos\theta_i + \sin\theta_i\cos\theta_t}$$



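The ratios of the reflected and transmitted field components to the incident 
field components are specified by the following coefficients, called the Fresnel 
coefficients: 

$$r_s \equiv \frac{E_r^{(s)}}{E_i^{(s)}} = \frac{\sin\theta_t\cos\theta_i - \sin\theta_i\cos\theta_t}{\sin\theta_t\cos\theta_i + \sin\theta_i\cos\theta_t} = -\frac{\sin(\theta_i - \theta_t)}{\sin(\theta_i + \theta_t)} = \frac{n_i\cos\theta_i - n_t\cos\theta_t}{n_i\cos\theta_i + n_t\cos\theta_t} \tag{3.18}$$

$$t_s \equiv \frac{E_t^{(s)}}{E_i^{(s)}} = \frac{2\sin\theta_t\cos\theta_i}{\sin\theta_t\cos\theta_i + \sin\theta_i\cos\theta_t} = \frac{2\sin\theta_t\cos\theta_i}{\sin(\theta_i + \theta_t)} = \frac{2n_i\cos\theta_i}{n_i\cos\theta_i + n_t\cos\theta_t} \tag{3.19}$$

$$r_p \equiv \frac{E_r^{(p)}}{E_i^{(p)}} = \frac{\cos\theta_t\sin\theta_t - \cos\theta_i\sin\theta_i}{\cos\theta_t\sin\theta_t + \cos\theta_i\sin\theta_i} = -\frac{\tan(\theta_i - \theta_t)}{\tan(\theta_i + \theta_t)} = \frac{n_i\cos\theta_t - n_t\cos\theta_i}{n_i\cos\theta_t + n_t\cos\theta_i} \tag{3.20}$$

$$t_p \equiv \frac{E_t^{(p)}}{E_i^{(p)}} = \frac{2\cos\theta_i\sin\theta_t}{\cos\theta_t\sin\theta_t + \cos\theta_i\sin\theta_i} = \frac{2\cos\theta_i\sin\theta_t}{\sin(\theta_i + \theta_t)\cos(\theta_i - \theta_t)} = \frac{2n_i\cos\theta_i}{n_i\cos\theta_t + n_t\cos\theta_i} \tag{3.21}$$

All of the above forms of the Fresnel coefficients are potentially useful, depending 
on the problem at hand. Remember that the angles in the coefficients are not inde- 
pendently chosen, but are subject to Snell's law (3.7). (The right-most expression 
for each coefficient is obtained from the first form using Snell's law.) 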

The Fresnel coefficients pin down the electric field amplitudes on the two 
sides of the boundary. They also keep track of phase shifts at a boundary. In 
Fig. 3.3 we have plotted the Fresnel coefficients for the case of an air-glass inter- 
face. Notice that the reflection coefficients are sometimes negative in this plot, 
which corresponds to a phase shift of π upon reflection (remember $e^{i\pi} = -1$). 
Later we will see that when absorbing materials are encountered, more compli- 
cated phase shifts can arise due to the complex index of refraction. 
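
To see the sign changes concretely, the right-most (index) forms of (3.18)-(3.21) can be tabulated for a few angles. This is a minimal Python sketch assuming real indices 1 and 1.5 and incidence below any critical angle; the function name `fresnel_coefficients` and the sample angles are illustrative.

```python
import math

def fresnel_coefficients(n_i, n_t, theta_i):
    """Return (r_s, t_s, r_p, t_p) from the index forms of (3.18)-(3.21).

    Assumes real indices and a real transmitted angle.
    """
    cos_i = math.cos(theta_i)
    sin_t = n_i * math.sin(theta_i) / n_t          # Snell's law (3.7)
    cos_t = math.sqrt(1.0 - sin_t**2)
    r_s = (n_i * cos_i - n_t * cos_t) / (n_i * cos_i + n_t * cos_t)
    t_s = 2.0 * n_i * cos_i / (n_i * cos_i + n_t * cos_t)
    r_p = (n_i * cos_t - n_t * cos_i) / (n_i * cos_t + n_t * cos_i)
    t_p = 2.0 * n_i * cos_i / (n_i * cos_t + n_t * cos_i)
    return r_s, t_s, r_p, t_p

table = {deg: fresnel_coefficients(1.0, 1.5, math.radians(deg)) for deg in (0, 30, 60, 85)}
for deg, (r_s, t_s, r_p, t_p) in table.items():
    print(f"{deg:2d} deg  r_s={r_s:+.4f}  t_s={t_s:+.4f}  r_p={r_p:+.4f}  t_p={t_p:+.4f}")
```

For these assumed indices, r_s stays negative at every angle, while r_p starts negative at normal incidence and becomes positive at large angles.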



3.3 Reflectance and Transmittance 

We are often interested in knowing the fraction of intensity that transmits through 
or reflects from a boundary. Since intensity is proportional to the square of the 
amplitude of the electric field, we can write the fraction of the light reflected from 
the surface, or reflectance, in terms of the Fresnel coefficients: 



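$$R_s \equiv \frac{I_r^{(s)}}{I_i^{(s)}} = \frac{\left|E_r^{(s)}\right|^2}{\left|E_i^{(s)}\right|^2} = |r_s|^2 \qquad\text{and}\qquad R_p \equiv \frac{I_r^{(p)}}{I_i^{(p)}} = \frac{\left|E_r^{(p)}\right|^2}{\left|E_i^{(p)}\right|^2} = |r_p|^2 \tag{3.22}$$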



These expressions are applied individually to each polarization component (s or 
p). The intensity reflected for each of these orthogonal polarizations is additive 
because the two electric fields are orthogonal and cannot interfere with each 
other. The total reflected intensity is therefore 

$$I_r^{(\mathrm{total})} = I_r^{(s)} + I_r^{(p)} = R_s I_i^{(s)} + R_p I_i^{(p)} \tag{3.23}$$

where the incident intensity is given by (2.62): 

$$I_i^{(\mathrm{total})} = I_i^{(s)} + I_i^{(p)} = \frac{1}{2} n_i \epsilon_0 c \left[\left|E_i^{(s)}\right|^2 + \left|E_i^{(p)}\right|^2\right] \tag{3.24}$$



Since intensity is power per area, we can rewrite (3.23) as incident and re- 
flected power: 

$$P_r^{(\mathrm{total})} = P_r^{(s)} + P_r^{(p)} = R_s P_i^{(s)} + R_p P_i^{(p)} \tag{3.25}$$

Using this expression and requiring that energy be conserved (i.e. 
$P_i^{(\mathrm{total})} = P_r^{(\mathrm{total})} + P_t^{(\mathrm{total})}$), we find that the portion of the power that transmits is 

$$P_t^{(\mathrm{total})} = \left(P_i^{(s)} + P_i^{(p)}\right) - \left(P_r^{(s)} + P_r^{(p)}\right) = (1 - R_s)P_i^{(s)} + (1 - R_p)P_i^{(p)} \tag{3.26}$$

From this expression we see that the transmittance (i.e. the fraction of the light 
that transmits) for either polarization is 

$$T_s \equiv 1 - R_s \qquad\text{and}\qquad T_p \equiv 1 - R_p \tag{3.27}$$

Figure 3.4 shows typical reflectance and transmittance values for an air-glass 
interface. 

You might be surprised at first to learn that 



$$T_s \neq |t_s|^2 \qquad\text{and}\qquad T_p \neq |t_p|^2 \tag{3.28}$$



However, recall that the transmitted intensity (in terms of the transmitted fields) 
depends also on the refractive index. The Fresnel coefficients t s and t p relate the 
bare electric fields to each other, whereas the transmitted intensity (similar to 
(3.24)) is 



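$$I_t^{(\mathrm{total})} = I_t^{(s)} + I_t^{(p)} = \frac{1}{2} n_t \epsilon_0 c \left[\left|E_t^{(s)}\right|^2 + \left|E_t^{(p)}\right|^2\right] \tag{3.29}$$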



Therefore, we expect $T_s$ and $T_p$ to depend on the ratio of the refractive indices $n_t$ 
and $n_i$ as well as on the squares of $t_s$ and $t_p$. 

There is another more subtle reason for the inequalities in (3.28). Consider 
a lateral strip of light associated with a plane wave incident upon the material 
interface in Fig. 3.5. Upon refraction into the second medium, the strip is seen 
to change its width by the factor $\cos\theta_t/\cos\theta_i$. This is a geometrical effect, owing 
to the change in propagation direction at the interface. The change in direction 
alters the intensity (power per area) but not the power. In computing the trans- 
mittance, we must remove this geometrical effect from the ratio of the intensities, 
which leads to the following transmittance coefficients: 

$$T_s = \frac{n_t\cos\theta_t}{n_i\cos\theta_i}\,|t_s|^2 \qquad T_p = \frac{n_t\cos\theta_t}{n_i\cos\theta_i}\,|t_p|^2 \qquad \text{(valid when no total internal reflection)} \tag{3.30}$$
Figure 3.4 The reflectance and 
transmittance plotted versus $\theta_i$ for 
the case of an air-glass interface 
with $n_i = 1$ and $n_t = 1.5$. 




Figure 3.5 Light refracting into a 
surface 
David Brewster (1781-1868, Scottish) 
was born in Jedburgh, Scotland. 
His father was a teacher and wanted 
David to become a clergyman. At age 
twelve, David went to the University of 
Edinburgh for that purpose, but his incli- 
nation for natural science soon became 
apparent. He became licensed to preach, 
but his interests in science distracted 
him from that profession, and he spent 
much of his time studying diffraction. 
Taking an empirical approach, Brewster 
independently discovered many of the 
same things usually credited to Fresnel. 
He even made a dioptric apparatus for 
lighthouses before Fresnel developed 
his. Brewster became somewhat fa- 
mous in his day for the development 
of the kaleidoscope and stereoscope 
for enjoyment by the general public. 
Brewster was a prolific science writer 
and editor throughout his life. Among 
his works is an important biography of 
Isaac Newton. He was knighted for his 
accomplishments in 1831. (Wikipedia) 



Note that (3.30) is valid only if a real angle $\theta_t$ exists; it does not hold when the 
incident angle exceeds the critical angle for total internal reflection, discussed in 
section 3.5. In that situation, we must stick with (3.27). 
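
The consistency between (3.22), (3.27), and (3.30) can be checked numerically when a real transmitted angle exists. The Python sketch below, assuming an air-glass interface and a handful of sample angles, computes the reflectance from the squared reflection coefficients and the transmittance from (3.30), including the index and geometric factor. The function name `reflectance_transmittance` is illustrative.

```python
import math

def reflectance_transmittance(n_i, n_t, theta_i):
    """Return (R_s, T_s, R_p, T_p) using (3.22) for R and (3.30) for T.

    Assumes real indices and a real transmitted angle.
    """
    cos_i = math.cos(theta_i)
    cos_t = math.sqrt(1.0 - (n_i * math.sin(theta_i) / n_t) ** 2)
    r_s = (n_i * cos_i - n_t * cos_t) / (n_i * cos_i + n_t * cos_t)
    t_s = 2.0 * n_i * cos_i / (n_i * cos_i + n_t * cos_t)
    r_p = (n_i * cos_t - n_t * cos_i) / (n_i * cos_t + n_t * cos_i)
    t_p = 2.0 * n_i * cos_i / (n_i * cos_t + n_t * cos_i)
    factor = (n_t * cos_t) / (n_i * cos_i)     # index ratio times geometric factor
    return r_s**2, factor * t_s**2, r_p**2, factor * t_p**2

results = {deg: reflectance_transmittance(1.0, 1.5, math.radians(deg)) for deg in (0, 20, 45, 70)}
for deg, (R_s, T_s, R_p, T_p) in results.items():
    print(f"{deg:2d} deg  R_s+T_s={R_s + T_s:.12f}  R_p+T_p={R_p + T_p:.12f}")
```

Omitting the prefactor and using the squared transmission coefficient alone would not sum to one with the reflectance, which is the point of the inequalities in (3.28).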



Example 3.2 

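Show analytically for p-polarized light that $R_p + T_p = 1$, where $R_p$ is given by (3.22) 
and $T_p$ is given by (3.30). 

Solution: From (3.20) we have 

$$R_p = \left|\frac{\cos\theta_t\sin\theta_t - \cos\theta_i\sin\theta_i}{\cos\theta_t\sin\theta_t + \cos\theta_i\sin\theta_i}\right|^2 = \frac{\cos^2\theta_t\sin^2\theta_t - 2\cos\theta_i\sin\theta_i\cos\theta_t\sin\theta_t + \cos^2\theta_i\sin^2\theta_i}{\left(\cos\theta_t\sin\theta_t + \cos\theta_i\sin\theta_i\right)^2}$$

From (3.21) and (3.30) we have 

$$T_p = \frac{n_t\cos\theta_t}{n_i\cos\theta_i}\left[\frac{2\cos\theta_i\sin\theta_t}{\cos\theta_t\sin\theta_t + \cos\theta_i\sin\theta_i}\right]^2 = \frac{\sin\theta_i\cos\theta_t}{\sin\theta_t\cos\theta_i}\,\frac{4\cos^2\theta_i\sin^2\theta_t}{\left(\cos\theta_t\sin\theta_t + \cos\theta_i\sin\theta_i\right)^2} = \frac{4\cos\theta_i\sin\theta_t\sin\theta_i\cos\theta_t}{\left(\cos\theta_t\sin\theta_t + \cos\theta_i\sin\theta_i\right)^2}$$

Then 

$$R_p + T_p = \frac{\cos^2\theta_t\sin^2\theta_t + 2\cos\theta_i\sin\theta_i\cos\theta_t\sin\theta_t + \cos^2\theta_i\sin^2\theta_i}{\left(\cos\theta_t\sin\theta_t + \cos\theta_i\sin\theta_i\right)^2} = \frac{\left(\cos\theta_t\sin\theta_t + \cos\theta_i\sin\theta_i\right)^2}{\left(\cos\theta_t\sin\theta_t + \cos\theta_i\sin\theta_i\right)^2} = 1$$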



3.4 Brewster's Angle 

Notice $r_p$ and $R_p$ go to zero at a certain angle in Figs. 3.3 and 3.4, indicating that 
no p-polarized light is reflected at this angle. This behavior is quite general, as we 
can see from the second form of the Fresnel coefficient formula for $r_p$ in (3.20), 
which has $\tan(\theta_i + \theta_t)$ in the denominator. Since the tangent 'blows up' at π/2, 
the reflection coefficient goes to zero when 

$$\theta_i + \theta_t = \frac{\pi}{2} \qquad \text{(requirement for zero p-polarized reflection)} \tag{3.31}$$

By inspecting Fig. 3.1, we see that this condition occurs when the reflected and 
transmitted wave vectors, $\mathbf{k}_r$ and $\mathbf{k}_t$, are perpendicular to each other. If we insert 
(3.31) into Snell's law (3.7), we can solve for the incident angle $\theta_i$ that gives rise to 
this special circumstance: 

$$n_i\sin\theta_i = n_t\sin\left(\frac{\pi}{2} - \theta_i\right) = n_t\cos\theta_i \tag{3.32}$$

The angle that satisfies this equation, in terms of the refractive indices, is 
readily found to be 

$$\theta_B = \tan^{-1}\frac{n_t}{n_i} \tag{3.33}$$

We have replaced the specific $\theta_i$ with $\theta_B$ in honor of Sir David Brewster who first 
discovered the phenomenon. The angle $\theta_B$ is called Brewster's angle. At Brewster's 
angle, no p-polarized light reflects (see L3.4). Physically, the p-polarized light 
cannot reflect because $\mathbf{k}_r$ and $\mathbf{k}_t$ are perpendicular. A reflection would require 
the microscopic dipoles at the surface of the second material to radiate along 
their axes, which they cannot do. Maxwell's equations 'know' about this, and so 
everything is nicely consistent. 
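
As a numerical illustration of (3.31)-(3.33), the following Python sketch computes Brewster's angle for an assumed air-glass interface and confirms that the p-polarized reflection coefficient vanishes there and that the incident and transmitted angles sum to a right angle. The function name `brewster_angle` and the index values are illustrative.

```python
import math

def brewster_angle(n_i: float, n_t: float) -> float:
    """Brewster's angle in radians, from (3.33): tan(theta_B) = n_t / n_i."""
    return math.atan2(n_t, n_i)

n_i, n_t = 1.0, 1.5                                    # assumed air-to-glass indices
theta_B = brewster_angle(n_i, n_t)
theta_t = math.asin(n_i * math.sin(theta_B) / n_t)     # Snell's law (3.7)

# p-polarized reflection coefficient, index form of (3.20)
r_p = ((n_i * math.cos(theta_t) - n_t * math.cos(theta_B))
       / (n_i * math.cos(theta_t) + n_t * math.cos(theta_B)))

print(f"theta_B = {math.degrees(theta_B):.2f} deg")
print(f"theta_B + theta_t = {math.degrees(theta_B + theta_t):.2f} deg, r_p = {r_p:.2e}")
```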

3.5 Total Internal Reflection 

From Snell's law (3.7), we can compute the transmitted angle in terms of the 
incident angle: 



$$\theta_t = \sin^{-1}\left(\frac{n_i}{n_t}\sin\theta_i\right) \tag{3.34}$$

The angle $\theta_t$ is real only if the argument of the inverse sine is less than or equal to 
one. If $n_i > n_t$, we can find a critical angle at which the argument begins to exceed 
one: 

$$\theta_c \equiv \sin^{-1}\frac{n_t}{n_i} \tag{3.35}$$

When $\theta_i > \theta_c$, then there is total internal reflection and we can directly show that 
$R_s = 1$ and $R_p = 1$ (see P3.9).$^5$ To demonstrate this, one computes the Fresnel 
coefficients (3.18) and (3.20) while employing the following substitutions: 

$$\sin\theta_t = \frac{n_i}{n_t}\sin\theta_i \qquad \text{(Snell's law)} \tag{3.36}$$

and 

$$\cos\theta_t = i\sqrt{\frac{n_i^2}{n_t^2}\sin^2\theta_i - 1} \qquad (\theta_i > \theta_c) \tag{3.37}$$

(see P0.19). 

In this case, $\theta_t$ is a complex number. However, we do not assign geometrical 
significance to it in terms of any direction. Actually, we don't even need to know 
the value for $\theta_t$; we need only the values for $\sin\theta_t$ and $\cos\theta_t$, as specified in (3.36) 
and (3.37). Even though $\sin\theta_t$ is greater than one and $\cos\theta_t$ is imaginary, we can 
use their values to compute $r_s$, $r_p$, $t_s$, and $t_p$. (Complex notation is wonderful!) 
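
Complex arithmetic makes this substitution easy to try numerically. The Python sketch below assumes a glass-to-air interface (indices 1.5 and 1) and an incident angle of 60°, beyond the critical angle. It relies on `cmath.sqrt` returning the positive-imaginary root for a negative real argument, which matches the sign choice in (3.37); the variable names are illustrative.

```python
import cmath
import math

n_i, n_t = 1.5, 1.0                      # assumed glass-to-air indices
theta_c = math.asin(n_t / n_i)           # critical angle (3.35)
theta_i = math.radians(60.0)             # assumed angle beyond theta_c

sin_t = (n_i / n_t) * math.sin(theta_i)  # (3.36): greater than one here
cos_t = cmath.sqrt(1.0 - sin_t**2)       # (3.37): +i * sqrt(sin_t^2 - 1)
cos_i = math.cos(theta_i)

# Index forms of the reflection coefficients (3.18) and (3.20)
r_s = (n_i * cos_i - n_t * cos_t) / (n_i * cos_i + n_t * cos_t)
r_p = (n_i * cos_t - n_t * cos_i) / (n_i * cos_t + n_t * cos_i)

print(f"theta_c = {math.degrees(theta_c):.2f} deg, sin(theta_t) = {sin_t:.4f}")
print(f"|r_s| = {abs(r_s):.6f}, phase = {math.degrees(cmath.phase(r_s)):.2f} deg")
print(f"|r_p| = {abs(r_p):.6f}, phase = {math.degrees(cmath.phase(r_p)):.2f} deg")
```

Both coefficients have unit magnitude, so all of the light reflects, but their phases differ.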



Figure 3.6 The intensity radiation 
pattern of an oscillating dipole as 
a function of angle. Note that the 
dipole does not radiate along the 
axis of oscillation, giving rise to 
Brewster's angle for reflection. 



5 M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 1.5.4 (Cambridge University Press, 1999). 
Figure 3.7 Animation of light 
waves incident on an interface 
both below and beyond the critical 
angle. 







Figure 3.8 A wave experiencing 
total internal reflection creates an 
evanescent wave that propagates 
parallel to the interface. (The 
reflected wave is not shown.) 



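Upon substitution of (3.36) and (3.37) into the Fresnel reflection coefficients 
(3.18) and (3.20) we obtain 

$$r_s = \frac{\dfrac{n_i}{n_t}\cos\theta_i - i\sqrt{\dfrac{n_i^2}{n_t^2}\sin^2\theta_i - 1}}{\dfrac{n_i}{n_t}\cos\theta_i + i\sqrt{\dfrac{n_i^2}{n_t^2}\sin^2\theta_i - 1}} \qquad (\theta_i > \theta_c) \tag{3.38}$$

and 

$$r_p = -\frac{\cos\theta_i - i\dfrac{n_i}{n_t}\sqrt{\dfrac{n_i^2}{n_t^2}\sin^2\theta_i - 1}}{\cos\theta_i + i\dfrac{n_i}{n_t}\sqrt{\dfrac{n_i^2}{n_t^2}\sin^2\theta_i - 1}} \qquad (\theta_i > \theta_c) \tag{3.39}$$

These Fresnel coefficients can be manipulated (see P3.9) into the forms 

$$r_s = \exp\left\{-2i\tan^{-1}\left[\frac{n_t}{n_i\cos\theta_i}\sqrt{\frac{n_i^2}{n_t^2}\sin^2\theta_i - 1}\right]\right\} \qquad (\theta_i > \theta_c) \tag{3.40}$$

and 

$$r_p = -\exp\left\{-2i\tan^{-1}\left[\frac{n_i}{n_t\cos\theta_i}\sqrt{\frac{n_i^2}{n_t^2}\sin^2\theta_i - 1}\right]\right\} \qquad (\theta_i > \theta_c) \tag{3.41}$$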



Each coefficient has a different phase (note $n_t/n_i$ vs. $n_i/n_t$ in the expressions), 
which means that the s- and p-polarized fields experience different phase shifts 
upon reflection. Nevertheless, we definitely have $|r_s| = 1$ and $|r_p| = 1$. We rightly 
conclude that 100% of the light reflects. The transmittance is zero as dictated by 
(3.27). We emphasize that one should not employ (3.29) or (3.30) in the case of 
total internal reflection, as the complex $\theta_t$ makes the geometric factor in (3.30) 
invalid. 

Even with zero transmittance, the boundary conditions from Maxwell's equa- 
tions (as worked out in appendix 3.A) require that the fields be non-zero on the 
transmitted side of the boundary, meaning $t_s \neq 0$ and $t_p \neq 0$. While this situation 
may seem like a contradiction at first, it is an accurate description of what actually 
happens. The coefficients $t_s$ and $t_p$ characterize evanescent waves that exist on 
the transmitted side of the interface. The evanescent wave travels parallel to the 
interface so that no energy is conveyed away from the interface deeper into the 
medium on the transmission side. 

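To compute the explicit form of the evanescent wave,$^6$ we plug (3.36) and 
(3.37) into the transmitted field (3.2): 

$$\begin{aligned}
\mathbf{E}_t &= \left[E_t^{(p)}\left(\hat{\mathbf{y}}\cos\theta_t - \hat{\mathbf{z}}\sin\theta_t\right) + \hat{\mathbf{x}}E_t^{(s)}\right] e^{i\left[k_t\left(y\sin\theta_t + z\cos\theta_t\right) - \omega t\right]} \\
&= \left[t_p E_i^{(p)}\left(\hat{\mathbf{y}}\,i\sqrt{\frac{n_i^2}{n_t^2}\sin^2\theta_i - 1} - \hat{\mathbf{z}}\,\frac{n_i}{n_t}\sin\theta_i\right) + \hat{\mathbf{x}}\,t_s E_i^{(s)}\right] e^{-k_t z\sqrt{\frac{n_i^2}{n_t^2}\sin^2\theta_i - 1}}\; e^{i\left[k_t y\frac{n_i}{n_t}\sin\theta_i - \omega t\right]}
\end{aligned} \tag{3.42}$$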



6 G. R. Fowles, Introduction to Modern Optics, 2nd ed., Sect 2.9 (New York: Dover, 1975). 
Figure 3.8 plots the evanescent wave described by (3.42) along with the associ- 
ated incident wave. The phase of the evanescent wave indicates that it propagates 
parallel to the boundary (in the y-dimension). Its strength decays exponentially 
away from the boundary (in the z-dimension). We leave the calculation of t s and 
t p as an exercise (P3.10). 
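
The exponential decay in (3.42) sets a length scale for how far the evanescent field reaches into the second medium. The Python sketch below estimates the distance over which the field amplitude falls by a factor of 1/e, using the real exponential in (3.42) together with the relation between the transmitted wave number, the index, and the vacuum wavelength. The glass-to-air indices, the 60° angle, the 633 nm vacuum wavelength, and the function name `evanescent_decay_length` are assumptions chosen only for illustration.

```python
import math

def evanescent_decay_length(n_i, n_t, theta_i, wavelength_vac):
    """1/e decay length of the evanescent field amplitude, from the real exponential in (3.42)."""
    arg = (n_i / n_t) ** 2 * math.sin(theta_i) ** 2 - 1.0
    if arg <= 0.0:
        raise ValueError("theta_i must exceed the critical angle")
    k_t = 2.0 * math.pi * n_t / wavelength_vac    # k_t = n_t * omega / c
    return 1.0 / (k_t * math.sqrt(arg))

# Assumed example: glass to air, 60 degrees, 633 nm vacuum wavelength
delta = evanescent_decay_length(1.5, 1.0, math.radians(60.0), 633e-9)
print(f"decay length = {delta * 1e9:.1f} nm")
```

Under these assumptions the field dies away within a fraction of a wavelength, and the decay length grows without bound as the incident angle approaches the critical angle from above.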



3.6 Reflections from Metal 



In this section we generalize our analysis to materials with complex refractive 
index $\mathcal{N} = n + i\kappa$. As a reminder, the imaginary part of the index controls atten- 
uation of a wave as it propagates within a material. The real part of the index 
governs the oscillatory nature of the wave. It turns out that both the imaginary 
and real parts of the index strongly influence the reflection of light from a sur- 
face. The reader may be grateful that there is no need to re-derive the Fresnel 
coefficients (3.18)-(3.21) for the case of complex indices. The coefficients remain 
valid whether the index is real or complex - just replace the real index n with the 
complex index $\mathcal{N}$. However, we do need to be a bit careful when applying them. 

We restrict our discussion to reflections from a metallic or other absorbing
material surface. As we found in the case of total internal reflection, we actually do
not need to know the transmitted angle $\theta_t$ to employ Fresnel reflection coefficients
(3.18) and (3.20). We need only acquire expressions for $\cos\theta_t$ and $\sin\theta_t$, and we
can obtain those from Snell's law (3.7). To minimize complications, we let the
incident refractive index be $n_i = 1$ (which is often the case). Let the index on
the transmitted side be written as $\mathcal{N}_t = \mathcal{N}$. Then by Snell's law, the sine of the
transmitted angle is

$$\sin\theta_t = \frac{\sin\theta_i}{\mathcal{N}} \tag{3.43}$$

This expression is of course complex since $\mathcal{N}$ is complex, which is just fine.$^7$ The
cosine of the same angle is

$$\cos\theta_t = \sqrt{1 - \sin^2\theta_t} = \frac{1}{\mathcal{N}} \sqrt{\mathcal{N}^2 - \sin^2\theta_i} \tag{3.44}$$



The positive sign in front of the square root is appropriate since it is clearly the 
right choice if the imaginary part of the index approaches zero. 

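Upon substitution of these expressions, the Fresnel reflection coefficients
(3.18) and (3.20) become

$$r_s = \frac{\cos\theta_i - \sqrt{\mathcal{N}^2 - \sin^2\theta_i}}{\cos\theta_i + \sqrt{\mathcal{N}^2 - \sin^2\theta_i}} \tag{3.45}$$

and

$$r_p = \frac{\sqrt{\mathcal{N}^2 - \sin^2\theta_i} - \mathcal{N}^2 \cos\theta_i}{\sqrt{\mathcal{N}^2 - \sin^2\theta_i} + \mathcal{N}^2 \cos\theta_i} \tag{3.46}$$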



7 See M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 14.2 (Cambridge University Press, 
1999). 
Figure 3.9 The reflectances (top) with associated phases (bottom) for silver,
which has index $n = 0.13$ and $\kappa = 4.05$. Note the minimum of $R_p$ corresponding
to a kind of Brewster's angle.

These expressions are tedious to evaluate. When evaluating the expressions, it is
usually desirable to put them into the form

$$r_s = |r_s| e^{i\phi_s} \tag{3.47}$$

and

$$r_p = |r_p| e^{i\phi_p} \tag{3.48}$$

We refrain from putting (3.45) and (3.46) into this form using the general
expressions; we would get a big mess. It is a good idea to let your calculator or
a computer do it after a specific value for $\mathcal{N} = n + i\kappa$ is chosen. An important
point to notice is that the phases upon reflection can be very different for s- and
p-polarization components (i.e. $\phi_p$ and $\phi_s$ can be very different). This is true in
general, even when the reflectivity is high (i.e. $|r_s|$ and $|r_p|$ on the order of unity).

Brewster's angle exists also for surfaces with complex refractive index. However,
in general the expressions (3.46) and (3.48) do not go to zero at any incident
angle $\theta_i$. Rather, the reflection of p-polarized light can go through a minimum at
some angle $\theta_i$, which we refer to as Brewster's angle (see Fig. 3.9). This minimum
is best found numerically since the general expression for $|r_p|$ in terms of $n$ and $\kappa$
and as a function of $\theta_i$ can be unwieldy.
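As a hedged illustration of letting a computer do this work, the sketch below evaluates (3.45) and (3.46) with complex arithmetic, converts the results to the polar forms (3.47) and (3.48), and scans for the minimum of $R_p = |r_p|^2$. The function names, the brute-force scan, and its step size are assumptions made for this example; the silver index is the one quoted in the caption of Fig. 3.9, and incidence from index 1 is assumed as in the text.

```python
import cmath
import math

def fresnel_metal(theta_i_deg, n, kappa):
    """r_s and r_p from (3.45) and (3.46) for incidence from index 1
    onto a material with complex index N = n + i*kappa."""
    N = complex(n, kappa)
    theta_i = math.radians(theta_i_deg)
    cos_i = math.cos(theta_i)
    # Principal square root; it reduces to N itself at normal incidence
    root = cmath.sqrt(N**2 - math.sin(theta_i) ** 2)
    r_s = (cos_i - root) / (cos_i + root)
    r_p = (root - N**2 * cos_i) / (root + N**2 * cos_i)
    return r_s, r_p

def polar(r):
    """Return (|r|, phase in radians), i.e. the form used in (3.47) and (3.48)."""
    return abs(r), cmath.phase(r)

def brewster_minimum(n, kappa, step_deg=0.01):
    """Brute-force scan for the angle that minimizes R_p = |r_p|^2."""
    best_angle, best_R = 0.0, float("inf")
    for j in range(int(round(89.99 / step_deg)) + 1):
        angle = j * step_deg
        R_p = abs(fresnel_metal(angle, n, kappa)[1]) ** 2
        if R_p < best_R:
            best_angle, best_R = angle, R_p
    return best_angle, best_R

# Silver index as quoted in the Fig. 3.9 caption; 60 degrees is an arbitrary test angle.
r_s, r_p = fresnel_metal(60.0, 0.13, 4.05)
print("r_s (magnitude, phase):", polar(r_s))
print("r_p (magnitude, phase):", polar(r_p))
theta_B, R_p_min = brewster_minimum(0.13, 4.05)
print(f"minimum R_p = {R_p_min:.4f} near {theta_B:.2f} degrees")
```

A finer search (for example, a bracketing minimizer) could replace the scan; the scan is used here only because it is easy to check by eye.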



Appendix 3.A Boundary Conditions For Fields at an Interface

Figure 3.10 Interface of two materials.



We are interested in the continuity of fields across a boundary from one medium
with index $n_1$ to another medium with index $n_2$. We will show that the components
of electric field and the magnetic field parallel to the interface surface must
be the same on either side (adjacent to the interface). This result is independent
of the refractive index of the materials; in the case of the magnetic field we assume
the permeability $\mu_0$ is the same on both sides. To derive the boundary conditions,
we consider a surface $S$ (a rectangle) that is perpendicular to the interface between
the two media and which extends into both media, as depicted in Fig. 3.10.
First we examine the integral form of Faraday's law (1.14)

$$\oint_C \mathbf{E} \cdot d\boldsymbol{\ell} = -\frac{\partial}{\partial t} \int_S \mathbf{B} \cdot \hat{\mathbf{n}}\, da \tag{3.49}$$

applied to the rectangular contour depicted in Fig. 3.10. We perform the path
integration on the left-hand side around the loop as follows:

$$\oint_C \mathbf{E} \cdot d\boldsymbol{\ell} = E_{1\parallel} d - E_{1\perp} \ell_1 - E_{2\perp} \ell_2 - E_{2\parallel} d + E_{2\perp} \ell_2 + E_{1\perp} \ell_1 = \left( E_{1\parallel} - E_{2\parallel} \right) d \tag{3.50}$$

Here, $E_{1\parallel}$ refers to the component of the electric field in the material with index
$n_1$ that is parallel to the interface. $E_{1\perp}$ refers to the component of the electric field
in the material with index $n_1$ which is perpendicular to the interface. Similarly,
$E_{2\parallel}$ and $E_{2\perp}$ are the parallel and perpendicular components of the electric field
in the material with index $n_2$. We have assumed that the rectangle is small enough
that the fields are uniform within the half rectangle on either side of the boundary.

Next, we shrink the loop down until it has zero surface area by letting the
lengths $\ell_1$ and $\ell_2$ go to zero. In this situation, the right-hand side of Faraday's law
(3.49) goes to zero

$$\int_S \mathbf{B} \cdot \hat{\mathbf{n}}\, da \rightarrow 0 \tag{3.51}$$

and we are left with

$$E_{1\parallel} = E_{2\parallel} \tag{3.52}$$

This simple relation is a general boundary condition, which is met at any material
interface. The component of the electric field that lies in the plane of the interface
must be the same on both sides of the interface.

We now derive a similar boundary condition for the magnetic field using the
integral form of Ampere's law:$^8$

$$\oint_C \mathbf{B} \cdot d\boldsymbol{\ell} = \mu_0 \int_S \left( \mathbf{J} + \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} \right) \cdot \hat{\mathbf{n}}\, da \tag{3.53}$$

As before, we are able to perform the path integration on the left-hand side for
the geometry depicted in the figure, which gives

$$\oint_C \mathbf{B} \cdot d\boldsymbol{\ell} = B_{1\parallel} d - B_{1\perp} \ell_1 - B_{2\perp} \ell_2 - B_{2\parallel} d + B_{2\perp} \ell_2 + B_{1\perp} \ell_1 = \left( B_{1\parallel} - B_{2\parallel} \right) d \tag{3.54}$$

The notation for parallel and perpendicular components on either side of the
interface is similar to that used in (3.50).

Again, we can shrink the loop down until it has zero surface area by letting the
lengths $\ell_1$ and $\ell_2$ go to zero. In this situation, the right-hand side of (3.53) goes to
zero (ignoring the possibility of surface currents):

$$\int_S \left( \mathbf{J} + \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} \right) \cdot \hat{\mathbf{n}}\, da \rightarrow 0 \tag{3.55}$$

and we are left with

$$B_{1\parallel} = B_{2\parallel} \tag{3.56}$$

This is a general boundary condition that must be satisfied at the material interface.

8 This form can be obtained from (1.4) by integration over the surface $S$ in Fig. 3.10 and applying
Stokes' theorem (0.12) to the magnetic field term.
Exercises

Exercises for 3.2 The Fresnel Coefficients

P3.1 Derive the Fresnel coefficients (3.20) and (3.21) for p-polarized light.

P3.2 Verify that each of the alternative forms given in (3.18)-(3.21) is equivalent
(given Snell's law). Show that at normal incidence (i.e. $\theta_i = \theta_t = 0$)
the Fresnel coefficients reduce to

$$\lim_{\theta_i \to 0} r_s = \lim_{\theta_i \to 0} r_p = -\frac{n_t - n_i}{n_t + n_i} \quad \text{and} \quad \lim_{\theta_i \to 0} t_s = \lim_{\theta_i \to 0} t_p = \frac{2 n_i}{n_t + n_i}$$

P3.3 Undoubtedly the most important interface in optics is when air meets
glass. Use a computer to make the following plots for this interface as a
function of the incident angle. Use $n_i = 1$ for air and $n_t = 1.6$ for glass.
Explicitly label Brewster's angle on all of the applicable graphs.

(a) $r_p$ and $t_p$ (plot together on same graph)

(b) $R_p$ and $T_p$ (plot together on same graph)

(c) $r_s$ and $t_s$ (plot together on same graph)

(d) $R_s$ and $T_s$ (plot together on same graph)



Exercises for 3.3 Reflectance and Transmittance 

L3.4 (a) In the laboratory, measure the reflectance for both s- and p-polarized
light from a flat glass surface at about ten points. You can normalize
the detector by placing it in the incident beam of light before the glass
surface. Especially watch for Brewster's angle (described in section 3.4).
Figure 3.11 illustrates the experimental setup. (video)

Figure 3.11 Experimental setup for lab 3.4 (laser, polarizer, uncoated glass on
rotation stage, high sensitivity detector).

(b) Use a computer to calculate the theoretical air-to-glass reflectance
as a function of incident angle (i.e. plot $R_s$ and $R_p$ as a function of $\theta_i$).
Take the index of refraction for glass to be $n_t = 1.54$ and the index for air
to be one. Plot this theoretical calculation as a smooth line on a graph.
Plot your experimental data from (a) as points on this same graph (not
points connected by lines).

P3.5 A pentaprism is a five-sided reflecting prism used to deviate a beam of
light by 90° without inverting an image (see Fig. 3.12). Pentaprisms are
used in the viewfinders of SLR cameras.

(a) What prism angle $\beta$ is required for a normal-incidence beam from
the left to exit the bottom surface at normal incidence?

(b) If all interfaces of the pentaprism are uncoated glass with index
$n = 1.5$, what fraction of the intensity would get through this system for
a normal-incidence beam? Compute for p-polarized light, and include
transmission through the first and final surfaces as well as reflection at
the two interior surfaces.

NOTE: The transmission you calculate will be very poor. The reflecting
surfaces on pentaprisms are usually treated with a high-reflection
coating and the transmitting surfaces are treated with anti-reflection
coatings.

P3.6 Show analytically for s-polarized light that $R_s + T_s = 1$, where $R_s$ is
given by (3.22) and $T_s$ is given by (3.30).

Exercises for 3.4 Brewster's Angle 
P3.7 Find Brewster's angle for glass n = 1.5. 




Exercises for 3.5 Total Internal Reflection 



P3.8 Diamonds have an index of refraction of n = 2.42 which allows total in- 
ternal reflection to occur at relatively shallow angles of incidence. Gem 
cutters choose facet angles that ensure most of the light entering the 
top of the diamond will reflect back out to give the stone its expensive 
sparkle. One such cut, the "Eulitz Brilliant" cut, is shown in Fig. 3.13. 

(a) What is the critical angle for diamond? 

(b) One way to spot fake diamonds is by noticing reduced brilliance in 
the sparkle. What fraction of p-polarized light (intensity) would make 
it from point A to point B in the diagram for a diamond? If a piece 
of fused quartz (n = 1.46) was cut in the Eulitz Brilliant shape, what 
fraction of p-polarized light (intensity) would make it from point A to 
point B in the diagram? 




(c) What is the phase shift due to reflection for s-polarized light at the
first internal reflection depicted in the figure (incident angle 40.5°) in
diamond? What is the phase shift in fused quartz?

Figure 3.13 A Eulitz Brilliant cut diamond.

P3.9 Derive (3.40) and (3.41) and show that $R_s = 1$ and $R_p = 1$. HINT: See
problem P0.15.

P3.10 Compute $t_s$ and $t_p$ in the case of total internal reflection. Put your
answer in polar form (i.e. $t = |t| e^{i\phi}$).

P3.11 Use a computer to plot the air-to-water transmittance as a function
of incident angle (i.e. plot (3.27) as a function of $\theta_i$). Also plot the
water-to-air transmittance on a separate graph. Plot both $T_s$ and $T_p$ on
each graph. The index of refraction for water is $n = 1.33$. Take the index
of air to be one.

P3.12 Light ($\lambda_{\text{vac}} = 500$ nm) reflects internally from a glass surface ($n = 1.5$)
surrounded by air. The incident angle is $\theta_i = 45°$. An evanescent wave
travels parallel to the surface on the air side. At what distance from the
surface is the amplitude of the evanescent wave $1/e$ of its value at the
surface?




Figure 3.14 Geometry for P3.15 



Exercises for 3.6 Reflections from Metal 

P3.13 Using a computer, plot $|r_s|$, $|r_p|$ versus $\theta_i$ for silver ($n = 0.13$ and
$\kappa = 4.05$). Make a separate plot of the phases $\phi_s$ and $\phi_p$ from (3.47) and
(3.48). Clearly label each plot, and comment on how the phase shifts
are different from those experienced when reflecting from glass.$^9$

P3.14 Find Brewster's angle for silver ($n = 0.13$ and $\kappa = 4.0$) by calculating $R_p$
and finding its minimum. You will want to use a computer program to
do this (Matlab, Maple, Mathematica, etc.).

P3.15 The complex index for silver is given by $n = 0.13$ and $\kappa = 4.0$. Find $r_s$
and $r_p$ when reflecting from vacuum ($n = 1$, $\kappa = 0$) at $\theta_i = 80°$ and put
them into the forms (3.47) and (3.48).

9 Are you surprised that the real part of the index can be less than one?



Chapter 4 

Multiple Parallel Interfaces 



In chapter 3, we studied the transmission and reflection of light at a single
interface between two (isotropic homogeneous) materials with indices $n_i$ and $n_t$.
We found that the percent of light reflected versus transmitted depends on the
incident angle and on whether the light is s- or p-polarized. The Fresnel
coefficients $r_s$, $t_s$, $r_p$, $t_p$ (3.18)-(3.21) connect the reflected and transmitted fields to
the incident field, depending on the polarization of the incident light. Similarly,
either $R_s$ and $T_s$ or $R_p$ and $T_p$ determine the fraction of incident power that either
reflects or transmits (see (3.25) and (3.27)).

In this chapter we consider the overall transmission and reflection through 
multiple parallel interfaces. We start with a two-interface system, where a layer of 
material is inserted between the initial and final materials. This situation occurs 
frequently in optics. For example, lenses are often coated with a thin layer of 
material in an effort to reduce reflections. Metal mirrors usually have a thin oxide 
layer or a protective coating between the metal and the air. We can develop
reflection and transmission coefficients $r^{\text{tot}}$ and $t^{\text{tot}}$, which apply to the overall
double-boundary system, similar to the Fresnel coefficients for a single boundary.
Likewise, we can compute an overall reflectance and transmittance $R^{\text{tot}}$ and $T^{\text{tot}}$.
These can be used to compute the 'tunneling' of evanescent waves across a gap 
between two parallel surfaces when the critical angle for total internal reflection 
is exceeded. 

The formalism we develop for the double-boundary problem is useful for 
describing a simple instrument called a Fabry-Perot etalon (or interferometer if 
the instrument has the capability of variable spacing between the two surfaces). 
Such an instrument, which is constructed from two partially reflective parallel 
surfaces, is useful for distinguishing closely spaced wavelengths. 

Finally, in this chapter we will extend our analysis to multilayer coatings, 
where an arbitrary number of interfaces exist between many material layers. 
Multilayers are often used to make highly reflective mirror coatings from dielectric 
materials (as opposed to metallic materials). Such mirror coatings can reflect 
with efficiencies greater than 99.9% at certain wavelengths. In contrast, metallic
mirrors typically reflect with ~96% efficiency, which can be a significant loss
if there are many mirrors in an optical system. Dielectric multilayer coatings
also have the advantage of being more durable and less prone to damage from
high-intensity lasers.

4.1 Double-Interface Problem Solved Using Fresnel Coefficients

Consider a slab of material sandwiched between two other materials as depicted
in Fig. 4.1. Because there are multiple reflections inside the middle layer, we have
dropped the subscripts $i$, $r$, and $t$ used in chapter 3 and instead use the symbols
$\rightarrow$ and $\leftarrow$ to indicate forward and backward traveling waves, respectively. Let $n_1$
stand for the refractive index of the middle layer. For consistency with notation
that we will later use for many-layer systems, let $n_0$ and $n_2$ represent the indices
of the other two regions. For simplicity, we assume that indices are real. As with
the single-boundary problem, we are interested in finding the overall transmitted
fields $E_{2\rightarrow}^{(s)}$ and $E_{2\rightarrow}^{(p)}$ and the overall reflected fields $E_{0\leftarrow}^{(s)}$ and $E_{0\leftarrow}^{(p)}$ in terms of the
incident fields $E_{0\rightarrow}^{(s)}$ and $E_{0\rightarrow}^{(p)}$.

Both forward and backward traveling plane waves exist in the middle region. 
Our intuition rightly tells us that in this region there are many reflections, bounc- 
ing both forward and backward between the two surfaces. It might therefore seem 
that we need to keep track of an infinite number of plane waves, each correspond- 
ing to a different number of bounces. Fortunately, the many forward-traveling 
plane waves all travel in the same direction. Similarly, the backward-traveling 
plane waves are all parallel. These plane-wave fields then join neatly into a single 
net forward-moving and a single net backward-moving plane wave within the
middle region.$^1$

Figure 4.1 Waves propagating through a dual interface between materials. The
interfaces lie at $z = 0$ and $z = d$.

As of yet, we do not know the amplitudes or phases of the net forward and net
backward traveling plane waves in the middle layer. We denote them by $E_{1\rightarrow}^{(s)}$ and
$E_{1\leftarrow}^{(s)}$ or by $E_{1\rightarrow}^{(p)}$ and $E_{1\leftarrow}^{(p)}$, separated into their $s$ and $p$ components as usual. Similarly,
$E_{0\leftarrow}^{(s)}$ and $E_{0\leftarrow}^{(p)}$ as well as $E_{2\rightarrow}^{(s)}$ and $E_{2\rightarrow}^{(p)}$ are understood to include light that 'leaks'
through the boundaries from the middle region. Thus, we need only concern
ourselves with the five plane waves depicted in Fig. 4.1.

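The various plane-wave fields are connected to each other at the boundaries
via the single-boundary Fresnel coefficients (3.18)-(3.21). At the first surface we
define

$$
\begin{aligned}
r_s^{0\rightarrow 1} &\equiv \frac{\sin\theta_1 \cos\theta_0 - \sin\theta_0 \cos\theta_1}{\sin\theta_1 \cos\theta_0 + \sin\theta_0 \cos\theta_1}, &
r_p^{0\rightarrow 1} &\equiv \frac{\cos\theta_1 \sin\theta_1 - \cos\theta_0 \sin\theta_0}{\cos\theta_1 \sin\theta_1 + \cos\theta_0 \sin\theta_0} \\
t_s^{0\rightarrow 1} &\equiv \frac{2 \sin\theta_1 \cos\theta_0}{\sin\theta_1 \cos\theta_0 + \sin\theta_0 \cos\theta_1}, &
t_p^{0\rightarrow 1} &\equiv \frac{2 \cos\theta_0 \sin\theta_1}{\cos\theta_1 \sin\theta_1 + \cos\theta_0 \sin\theta_0}
\end{aligned}
\tag{4.1}
$$

The notation $0 \rightarrow 1$ indicates the first surface from the perspective of starting
on the incident side and propagating towards the middle layer. The Fresnel
coefficients for the backward traveling light approaching the first interface from
within the middle layer are given by

$$
\begin{aligned}
r_s^{1\rightarrow 0} &= -r_s^{0\rightarrow 1}, &
r_p^{1\rightarrow 0} &= -r_p^{0\rightarrow 1} \\
t_s^{1\rightarrow 0} &\equiv \frac{2 \sin\theta_0 \cos\theta_1}{\sin\theta_0 \cos\theta_1 + \sin\theta_1 \cos\theta_0}, &
t_p^{1\rightarrow 0} &\equiv \frac{2 \cos\theta_1 \sin\theta_0}{\cos\theta_0 \sin\theta_0 + \cos\theta_1 \sin\theta_1}
\end{aligned}
\tag{4.2}
$$

where $1 \rightarrow 0$ again indicates connections at the first interface, but from the
perspective of beginning inside the middle layer. Finally, the single-boundary
coefficients for light approaching the second interface are

$$
\begin{aligned}
r_s^{1\rightarrow 2} &\equiv \frac{\sin\theta_2 \cos\theta_1 - \sin\theta_1 \cos\theta_2}{\sin\theta_2 \cos\theta_1 + \sin\theta_1 \cos\theta_2}, &
r_p^{1\rightarrow 2} &\equiv \frac{\cos\theta_2 \sin\theta_2 - \cos\theta_1 \sin\theta_1}{\cos\theta_2 \sin\theta_2 + \cos\theta_1 \sin\theta_1} \\
t_s^{1\rightarrow 2} &\equiv \frac{2 \sin\theta_2 \cos\theta_1}{\sin\theta_2 \cos\theta_1 + \sin\theta_1 \cos\theta_2}, &
t_p^{1\rightarrow 2} &\equiv \frac{2 \cos\theta_1 \sin\theta_2}{\cos\theta_2 \sin\theta_2 + \cos\theta_1 \sin\theta_1}
\end{aligned}
\tag{4.3}
$$

In a similar fashion, the notation $1 \rightarrow 2$ indicates connections made at the second
interface from the perspective of beginning in the middle layer.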

To solve for the connections between the five fields depicted in Fig. 4.1, we
will need four equations for either s or p polarization (taking the incident field as 
a given). To simplify things, we will consider s-polarized light in the upcoming 
analysis. The equations for p-polarized light look exactly the same; just replace 
the subscript s with p. Through the remainder of this section and the next, we 
will continue to economize by writing the equations only for s-polarized light 
with the understanding that they apply equally well to p-polarized light. 



1 The sum of parallel plane waves $\sum_j E_j e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}$, where the phase of each wave is contained in
$E_j$, can be written as $\left( \sum_j E_j \right) e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}$, which is effectively one plane wave.
The forward-traveling wave in the middle region arises from both a transmission
of the incident wave and a reflection of the backward-traveling wave in the
middle region at the first interface. Using the Fresnel coefficients, we can write
$E_{1\rightarrow}^{(s)}$ as the sum of fields arising from $E_{0\rightarrow}^{(s)}$ and $E_{1\leftarrow}^{(s)}$ as follows:

$$E_{1\rightarrow}^{(s)} = t_s^{0\rightarrow 1} E_{0\rightarrow}^{(s)} + r_s^{1\rightarrow 0} E_{1\leftarrow}^{(s)} \tag{4.4}$$

The factors $t_s^{0\rightarrow 1}$ and $r_s^{1\rightarrow 0}$ are the single-boundary Fresnel coefficients selected
appropriately from (4.1) and (4.2). Similarly, the overall reflected field $E_{0\leftarrow}^{(s)}$ is given by the
reflection of the incident field and the transmission of the backward-traveling
field in the middle region according to

$$E_{0\leftarrow}^{(s)} = r_s^{0\rightarrow 1} E_{0\rightarrow}^{(s)} + t_s^{1\rightarrow 0} E_{1\leftarrow}^{(s)} \tag{4.5}$$

Two connections done; two to go.

Before we continue, we need to specify an origin so that we can calculate 
phase shifts associated with propagation in the middle region. Propagation was 
not an issue in the single-boundary problem studied back in chapter 3. However, 
in the double-boundary problem, the thickness of the middle region dictates 
phase variations that strongly influence the result. We take the origin to be 
located on the first interface, as shown in Fig. 4.1. Since all fields in (4.4) and (4.5) 
are evaluated at the origin (y, z) = (0, 0), there are no phase factors needed. 

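We will connect the plane-wave fields across the second interface at the point
$\mathbf{r} = \hat{\mathbf{z}} d$. The appropriate phase-adjusted$^2$ field at $(y, z) = (0, d)$ is
$E_{1\rightarrow}^{(s)} e^{i \mathbf{k}_{1\rightarrow} \cdot \mathbf{r}} = E_{1\rightarrow}^{(s)} e^{i k_1 d \cos\theta_1}$, since $E_{1\rightarrow}^{(s)}$ is the field at the origin $(y, z) = (0, 0)$. The transmitted
field in the final medium arises only from the forward-traveling field in the middle
region, and at our selected point it is

$$E_{2\rightarrow}^{(s)} = t_s^{1\rightarrow 2} E_{1\rightarrow}^{(s)} e^{i k_1 d \cos\theta_1} \tag{4.6}$$

Note that $E_{2\rightarrow}^{(s)}$ stands for the transmitted field at the point $(y, z) = (0, d)$; its local
phase can be built into its definition, so there is no need to write an explicit phase.

The backward-traveling plane wave in the middle region arises from the
reflection of the forward-traveling plane wave in that region:

$$E_{1\leftarrow}^{(s)} e^{-i k_1 d \cos\theta_1} = r_s^{1\rightarrow 2} E_{1\rightarrow}^{(s)} e^{i k_1 d \cos\theta_1} \tag{4.7}$$

Like before, $E_{1\leftarrow}^{(s)}$ is referenced to the origin $(y, z) = (0, 0)$. Therefore, the factor
$e^{i \mathbf{k}_{1\leftarrow} \cdot \mathbf{r}} = e^{-i k_1 d \cos\theta_1}$ is needed at $(y, z) = (0, d)$.

The relations (4.4)-(4.7) permit us to find overall transmission and reflection
coefficients for the two-interface problem.

Example 4.1

Derive the transmission coefficient that connects the final transmitted field to the
incident field for the double-interface problem according to $t_s^{\text{tot}} \equiv E_{2\rightarrow}^{(s)} / E_{0\rightarrow}^{(s)}$.

2 In the middle region, $\mathbf{k}_{1\rightarrow} = k_1 \left( \hat{\mathbf{y}} \sin\theta_1 + \hat{\mathbf{z}} \cos\theta_1 \right)$ and $\mathbf{k}_{1\leftarrow} = k_1 \left( \hat{\mathbf{y}} \sin\theta_1 - \hat{\mathbf{z}} \cos\theta_1 \right)$.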
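Solution: From (4.6) we may write

$$E_{1\rightarrow}^{(s)} = \frac{E_{2\rightarrow}^{(s)}}{t_s^{1\rightarrow 2}} e^{-i k_1 d \cos\theta_1} \tag{4.8}$$

Substitution of this into (4.7) gives

$$E_{1\leftarrow}^{(s)} = r_s^{1\rightarrow 2} \frac{E_{2\rightarrow}^{(s)}}{t_s^{1\rightarrow 2}} e^{i k_1 d \cos\theta_1} \tag{4.9}$$

Next, substituting both (4.8) and (4.9) into (4.4) yields the connection we seek
between the incident and transmitted fields:

$$\frac{E_{2\rightarrow}^{(s)}}{t_s^{1\rightarrow 2}} e^{-i k_1 d \cos\theta_1} = t_s^{0\rightarrow 1} E_{0\rightarrow}^{(s)} + r_s^{1\rightarrow 0} r_s^{1\rightarrow 2} \frac{E_{2\rightarrow}^{(s)}}{t_s^{1\rightarrow 2}} e^{i k_1 d \cos\theta_1} \tag{4.10}$$

After rearranging, we arrive at the more useful form

$$t_s^{\text{tot}} \equiv \frac{E_{2\rightarrow}^{(s)}}{E_{0\rightarrow}^{(s)}} = \frac{t_s^{0\rightarrow 1} e^{i k_1 d \cos\theta_1} t_s^{1\rightarrow 2}}{1 - r_s^{1\rightarrow 0} r_s^{1\rightarrow 2} e^{2 i k_1 d \cos\theta_1}} \qquad (p \text{ can be switched for } s) \tag{4.11}$$

The coefficient $t_s^{\text{tot}}$ derived in Example 4.1 connects the amplitude and phase
of the incident field to the amplitude and phase of the transmitted field in a
manner similar to the single-boundary Fresnel coefficients. The numerator of
(4.11) reminds us of the physics of the situation: the field transmits through the
first interface, acquires a phase due to propagating through the middle layer, and
transmits through the second interface. The denominator of (4.11) modifies the
result to account for feedback from multiple reflections in the middle region.$^3$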

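The overall reflection coefficient is found to be (see P4.1)

$$r_s^{\text{tot}} \equiv \frac{E_{0\leftarrow}^{(s)}}{E_{0\rightarrow}^{(s)}} = r_s^{0\rightarrow 1} + \frac{t_s^{0\rightarrow 1} e^{i k_1 d \cos\theta_1} r_s^{1\rightarrow 2} e^{i k_1 d \cos\theta_1} t_s^{1\rightarrow 0}}{1 - r_s^{1\rightarrow 0} r_s^{1\rightarrow 2} e^{2 i k_1 d \cos\theta_1}} \qquad (p \text{ can be switched for } s) \tag{4.12}$$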



Again the equation reminds us of the basic physics, and we did not completely 
simplify the expression to make this more apparent. There is an initial reflection 
from the first interface. That light is joined by light that transmits through the first 
interface (looking at only the numerator of the second term), propagates through 
the middle layer, reflects from the second interface, propagates back through the 
middle layer, and transmits back through the first interface. The denominator of 
the second term accounts for the effects of multiple-reflection feedback. 
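The closed forms (4.11) and (4.12) can be checked numerically. The sketch below is an illustration under stated assumptions, not the text's own procedure: it uses real indices, s-polarization, and the single-boundary coefficients written in the equivalent index form $r = (n_a\cos\theta_a - n_b\cos\theta_b)/(n_a\cos\theta_a + n_b\cos\theta_b)$, $t = 2 n_a\cos\theta_a/(n_a\cos\theta_a + n_b\cos\theta_b)$, which avoids the 0/0 that the sine form gives at normal incidence. It compares (4.11) with an explicit sum over bounces (the alternative geometric-series route mentioned in the footnote to this section) and checks energy conservation. The function names and the numerical values are hypothetical.

```python
import cmath
import math

def single_boundary_s(n_a, n_b, cos_a, cos_b):
    """s-polarized Fresnel r and t for light going from medium a into medium b
    (index form, equivalent to the angle forms (4.1)-(4.3) via Snell's law)."""
    denom = n_a * cos_a + n_b * cos_b
    return (n_a * cos_a - n_b * cos_b) / denom, 2.0 * n_a * cos_a / denom

def double_interface_s(n0, n1, n2, theta0_deg, d, lambda_vac, bounces=60):
    sin0 = math.sin(math.radians(theta0_deg))
    cos0 = math.cos(math.radians(theta0_deg))
    # Snell's law gives the cosines in the middle and final regions
    cos1 = cmath.sqrt(1.0 - (n0 * sin0 / n1) ** 2)
    cos2 = cmath.sqrt(1.0 - (n0 * sin0 / n2) ** 2)
    r01, t01 = single_boundary_s(n0, n1, cos0, cos1)
    r10, t10 = single_boundary_s(n1, n0, cos1, cos0)
    r12, t12 = single_boundary_s(n1, n2, cos1, cos2)
    # Phase factor exp(i k1 d cos(theta1)) for one pass through the middle layer
    hop = cmath.exp(1j * (2.0 * math.pi * n1 / lambda_vac) * d * cos1)
    feedback = 1.0 - r10 * r12 * hop**2
    t_tot = t01 * hop * t12 / feedback                    # closed form (4.11)
    r_tot = r01 + t01 * hop * r12 * hop * t10 / feedback  # closed form (4.12)
    # Explicit sum over the number of round trips inside the layer
    t_series = sum(t01 * hop * t12 * (r10 * r12 * hop**2) ** m for m in range(bounces))
    T = (n2 * cos2.real / (n0 * cos0)) * abs(t_tot) ** 2
    R = abs(r_tot) ** 2
    return t_tot, r_tot, t_series, T, R

# Illustrative values: air / thin film / glass, 30 degree incidence, lengths in nm.
t_tot, r_tot, t_series, T, R = double_interface_s(1.0, 1.38, 1.5, 30.0, 100.0, 633.0)
print("closed form t_tot:", t_tot)
print("bounce-by-bounce sum:", t_series)
print("R + T =", R + T)
```

With real indices and angles below the critical angle, the bounce sum converges quickly because the product of the two reflection coefficients is small.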



3 Our derivation method avoids the need for explicit accounting of multiple reflections. For an
alternative approach arriving at the same result via an infinite geometric series, see M. Born and E.
Wolf, Principles of Optics, 7th ed., Sect. 7.6.1 (Cambridge University Press, 1999) or G. R. Fowles,
Introduction to Modern Optics, 2nd ed., Sect 4.1 (New York: Dover, 1975).

4.2 Two-Interface Transmittance at Sub Critical Angles

Often we are interested in the intensity of the light that goes through or reflects
from the double-interface setup. Because the transmission coefficient (4.11) has
a simpler form than the reflection coefficient (4.12), it will be easier to calculate
the total transmittance $T_s^{\text{tot}}$ and obtain the reflectance, if desired, from the
relationship

$$R_s^{\text{tot}} + T_s^{\text{tot}} = 1 \qquad (p \text{ can be switched for } s) \tag{4.13}$$



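When the transmitted angle $\theta_2$ is real, we may write the fraction of the transmitted
power as in (3.30):

$$T_s^{\text{tot}} = \frac{n_2 \cos\theta_2}{n_0 \cos\theta_0} \left| t_s^{\text{tot}} \right|^2 = \frac{n_2 \cos\theta_2}{n_0 \cos\theta_0} \frac{\left| t_s^{0\rightarrow 1} \right|^2 \left| t_s^{1\rightarrow 2} \right|^2}{\left| e^{-i k_1 d \cos\theta_1} - r_s^{1\rightarrow 0} r_s^{1\rightarrow 2} e^{i k_1 d \cos\theta_1} \right|^2} \qquad (\theta_2 \text{ real};\ p \text{ can be switched for } s) \tag{4.14}$$

(Before squaring, we multiplied the top and bottom of (4.11) by $e^{-i k_1 d \cos\theta_1}$ to
make the denominator more symmetric for later convenience.) Equation (4.14)
remains valid even if the angle $\theta_1$ is complex. Thus, it can be applied to the case of
evanescent waves 'tunneling' through a gap where $\theta_0$ lies beyond the critical angle
for total internal reflection from the middle layer. This will be studied further in
section 4.3.

When there are no evanescent waves in any of the regions (i.e. $\theta_0$ and $\theta_1$ both
do not exceed critical angle) we can simplify (4.14) into the following useful form
(see P4.3):$^4$

$$T_s^{\text{tot}} = \frac{T_s^{\max}}{1 + F_s \sin^2\left( \frac{\Phi_s}{2} \right)} \qquad (\theta_1 \text{ and } \theta_2 \text{ real}) \tag{4.15}$$

where

$$T_s^{\max} \equiv \frac{T_s^{0\rightarrow 1} T_s^{1\rightarrow 2}}{\left( 1 - \sqrt{R_s^{1\rightarrow 0} R_s^{1\rightarrow 2}} \right)^2} \tag{4.16}$$

$$\Phi_s \equiv 2 k_1 d \cos\theta_1 + \delta_{r_s^{1\rightarrow 0}} + \delta_{r_s^{1\rightarrow 2}} \tag{4.17}$$

and

$$F_s \equiv \frac{4 \sqrt{R_s^{1\rightarrow 0} R_s^{1\rightarrow 2}}}{\left( 1 - \sqrt{R_s^{1\rightarrow 0} R_s^{1\rightarrow 2}} \right)^2} \tag{4.18}$$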



The quantity $T_s^{\max}$ is the maximum possible transmittance of power through the
two surfaces. The single-interface transmittances ($T_s^{0\rightarrow 1}$ and $T_s^{1\rightarrow 2}$) and reflectances
($R_s^{1\rightarrow 0}$ and $R_s^{1\rightarrow 2}$) are calculated from the single-interface Fresnel coefficients in
the usual way as described in chapter 3. The numerator of $T_s^{\max}$ represents the
combined transmittances for the two interfaces without considering feedback
due to multiple reflections. The denominator enhances this value to account for
reinforcing feedback in the middle layer.

The phase delay experienced by the plane wave in the middle region is described
by $\Phi_s$. The term $2 k_1 d \cos\theta_1$ represents the phase delay acquired during
round-trip propagation in the middle region. The terms $\delta_{r_s^{1\rightarrow 0}}$ and $\delta_{r_s^{1\rightarrow 2}}$ account
for possible phase shifts upon reflection from each interface. They are defined
indirectly from the single-boundary Fresnel reflection coefficients:

$$r_s^{1\rightarrow 0} = \left| r_s^{1\rightarrow 0} \right| e^{i \delta_{r_s^{1\rightarrow 0}}} \quad \text{and} \quad r_s^{1\rightarrow 2} = \left| r_s^{1\rightarrow 2} \right| e^{i \delta_{r_s^{1\rightarrow 2}}} \tag{4.19}$$

If all the indices in the double-boundary system are real, then $\delta_{r_s^{1\rightarrow 0}}$ and $\delta_{r_s^{1\rightarrow 2}}$
can only be zero or $\pi$ (i.e. the coefficients can only be positive or negative real
numbers).

$F_s$ is called the coefficient of finesse (not to be confused with reflecting finesse
defined in section 4.6), which determines how strongly the transmittance is
influenced when $\Phi_s$ is varied (for example, through varying $d$ or the wavelength
$\lambda_{\text{vac}}$).

4 M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 7.6.1 (Cambridge University Press, 1999).



Example 4.2 

Consider a 'beam splitter' designed for s-polarized light incident on a substrate of
glass ($n = 1.5$) at 45° as shown in Fig. 4.2. A thin coating of zinc sulfide ($n = 2.32$)
is applied to the front of the glass to cause about half of the light to reflect. A
magnesium fluoride ($n = 1.38$) coating is applied to the back surface of the glass to
minimize reflections at that surface.$^5$ Each coating constitutes a separate double-interface
problem. The front coating is deferred to problem P4.5. In this example,
find the highest transmittance possible through the antireflection film at the back
of the 'beam splitter' and the smallest possible $d_2$ that accomplishes this for light
with wavelength $\lambda_{\text{vac}} = 633$ nm.

Solution: For the back coating, we have $n_0 = 1.5$, $n_1 = 1.38$, and $n_2 = 1$. We can
find $\theta_0$ and $\theta_1$ from $\theta_2 = 45^\circ$ using Snell's law

$$n_1 \sin\theta_1 = \sin\theta_2 \quad \Rightarrow \quad \theta_1 = \sin^{-1}\left( \frac{\sin 45^\circ}{1.38} \right) = 30.82^\circ$$

$$n_0 \sin\theta_0 = \sin\theta_2 \quad \Rightarrow \quad \theta_0 = \sin^{-1}\left( \frac{\sin 45^\circ}{1.5} \right) = 28.13^\circ$$

Next we calculate the single-boundary Fresnel coefficients:

$$r_s^{1\rightarrow 2} = -\frac{\sin\left( \theta_1 - \theta_2 \right)}{\sin\left( \theta_1 + \theta_2 \right)} = -\frac{\sin\left( 30.82^\circ - 45^\circ \right)}{\sin\left( 30.82^\circ + 45^\circ \right)} = 0.253$$



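$$r_s^{1\rightarrow 0} = -\frac{\sin\left( \theta_1 - \theta_0 \right)}{\sin\left( \theta_1 + \theta_0 \right)} = -\frac{\sin\left( 30.82^\circ - 28.13^\circ \right)}{\sin\left( 30.82^\circ + 28.13^\circ \right)} = -0.0549$$

These coefficients give us the phase shift due to reflection

$$\delta_{r_s^{1\rightarrow 0}} = \pi, \qquad \delta_{r_s^{1\rightarrow 2}} = 0$$

Figure 4.2 Side view of a beamsplitter (labels: partial reflection coating, glass,
anti-reflection coating of thickness $d_2$; 45° incidence; 46% and 54% beams).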



5 We ignore possible feedback between the front and rear coatings. Since the antireflection
films are usually imperfect, beam splitter substrates are often slightly wedged so that unwanted
reflections from the second surface travel in a different direction.







Figure 4.3 Animation showing 
frustrated total internal reflection. 



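The single-boundary reflectances are given by

$$R_s^{1\rightarrow 0} = \left| r_s^{1\rightarrow 0} \right|^2 = |-0.0549|^2 = 0.0030$$

$$R_s^{1\rightarrow 2} = \left| r_s^{1\rightarrow 2} \right|^2 = |0.253|^2 = 0.0640$$

and the transmittances are (using $R_s^{0\rightarrow 1} = R_s^{1\rightarrow 0}$)

$$T_s^{0\rightarrow 1} = 1 - R_s^{0\rightarrow 1} = 1 - 0.0030 = 0.997$$

$$T_s^{1\rightarrow 2} = 1 - R_s^{1\rightarrow 2} = 1 - 0.0640 = 0.936$$

Finally, we calculate the coefficient of finesse

$$F_s = \frac{4 \sqrt{R_s^{1\rightarrow 0} R_s^{1\rightarrow 2}}}{\left( 1 - \sqrt{R_s^{1\rightarrow 0} R_s^{1\rightarrow 2}} \right)^2} = \frac{4 \sqrt{(0.0030)(0.0640)}}{\left( 1 - \sqrt{(0.0030)(0.0640)} \right)^2} = 0.0570$$

and the maximum transmittance

$$T_s^{\max} = \frac{T_s^{0\rightarrow 1} T_s^{1\rightarrow 2}}{\left( 1 - \sqrt{R_s^{1\rightarrow 0} R_s^{1\rightarrow 2}} \right)^2} = \frac{(0.997)(0.936)}{\left( 1 - \sqrt{(0.0030)(0.0640)} \right)^2} = 0.960$$

Putting everything together, we have

$$T_s^{\text{tot}} = \frac{0.960}{1 + 0.0570 \sin^2\left( \frac{2 k_1 d_2 \cos\theta_1 + \pi}{2} \right)}$$

The maximum transmittance occurs when the sine is zero. In that case, $T_s^{\text{tot}} = 0.960$,
meaning that 96% of the light is transmitted. We find the thickness by setting
the argument of the sine to $\pi$:

$$2 k_1 d_2 \cos\theta_1 + \pi = 2\pi$$

Since $k_1 = 2\pi n_1 / \lambda_{\text{vac}}$, we have

$$d_2 = \frac{\lambda_{\text{vac}}}{4 n_1 \cos\theta_1} = \frac{633\ \text{nm}}{4 (1.38) \cos 30.82^\circ} = 134\ \text{nm}$$

Without the coating (i.e. $d_2 = 0$), the transmittance through the back surface
would be 0.908, so the coating does give an improvement.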
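The arithmetic of Example 4.2 is easy to reproduce on a computer. The following sketch follows the same steps (Snell's law, the sine forms of the single-boundary reflection coefficients, then (4.15)-(4.18)); the variable names and the helper `T_tot` are choices made for this illustration, and the phase shifts are assigned as 0 or $\pi$ from the signs of the real coefficients.

```python
import math

n0, n1, n2 = 1.5, 1.38, 1.0          # glass, MgF2 film, air (back surface)
lambda_vac = 633.0                   # nm
theta2 = math.radians(45.0)

# Snell's law, working backward from the exit angle
theta1 = math.asin(n2 * math.sin(theta2) / n1)
theta0 = math.asin(n2 * math.sin(theta2) / n0)

# Single-boundary s-polarized reflection coefficients (sine form)
r12 = -math.sin(theta1 - theta2) / math.sin(theta1 + theta2)
r10 = -math.sin(theta1 - theta0) / math.sin(theta1 + theta0)

R12, R10 = r12**2, r10**2
T01, T12 = 1.0 - R10, 1.0 - R12      # R for 0->1 equals R for 1->0

root = math.sqrt(R10 * R12)
F = 4.0 * root / (1.0 - root) ** 2          # coefficient of finesse (4.18)
T_max = T01 * T12 / (1.0 - root) ** 2       # maximum transmittance (4.16)

# Real coefficients: phase shift on reflection is 0 (positive) or pi (negative)
delta10 = math.pi if r10 < 0 else 0.0
delta12 = math.pi if r12 < 0 else 0.0

def T_tot(d):
    """Transmittance (4.15) for film thickness d in nm."""
    k1 = 2.0 * math.pi * n1 / lambda_vac
    Phi = 2.0 * k1 * d * math.cos(theta1) + delta10 + delta12   # (4.17)
    return T_max / (1.0 + F * math.sin(Phi / 2.0) ** 2)

d_opt = lambda_vac / (4.0 * n1 * math.cos(theta1))
print(f"theta1 = {math.degrees(theta1):.2f} deg, theta0 = {math.degrees(theta0):.2f} deg")
print(f"F = {F:.4f}, T_max = {T_max:.3f}, d_opt = {d_opt:.0f} nm")
print(f"T_tot(0) = {T_tot(0.0):.3f}, T_tot(d_opt) = {T_tot(d_opt):.3f}")
```

Small differences in the last digit relative to the worked example come from carrying unrounded intermediate values.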



4.3 Beyond Critical Angle: Tunneling of Evanescent Waves 

When $\theta_0$ is greater than the critical angle, an evanescent wave forms in the middle
region. In this case the formula (4.15) for the transmittance cannot be used.
However, the formula (4.14) still holds if the angle $\theta_2$ is real (i.e. if the critical
angle in the absence of the middle layer is not exceeded). Thus, we can use (4.14)
to describe frustrated total internal reflection. In this case an evanescent wave
occurs in the middle region, and if the second surface is sufficiently close to the
first surface, the evanescent wave stimulates the second surface to produce a
transmitted wave propagating at an angle $\theta_2$. This behavior is sometimes referred
to as 'tunneling'.

We do not need to deal directly with the complex angle $\theta_1$. Rather, we just need
$\sin\theta_1$ and $\cos\theta_1$ in order to calculate the single-boundary Fresnel coefficients.
From Snell's law we have

$$\sin\theta_1 = \frac{n_0}{n_1} \sin\theta_0 = \frac{n_2}{n_1} \sin\theta_2 \tag{4.20}$$

and for the middle layer we write

$$\cos\theta_1 = i \sqrt{\sin^2\theta_1 - 1} \tag{4.21}$$

Note that beyond the critical angle, $\sin\theta_1$ is greater than one. We illustrate how to
apply (4.14) via a specific example:

Example 4.3 

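Calculate the transmittance of p-polarized light through the region between two
closely spaced 45° right prisms, as shown in Fig. 4.4, as a function of $\lambda_{\text{vac}}$ and
the prism spacing $d$. Take the index of refraction of the prisms to be $n = 1.5$
surrounded by index $n = 1$, and use $\theta_0 = \theta_2 = 45^\circ$. Neglect possible reflections
from the exterior surfaces of the prisms.

Solution: From (4.20) and (4.21) we have

$$\sin\theta_1 = 1.5 \sin 45^\circ = 1.061 \quad \text{and} \quad \cos\theta_1 = i \sqrt{1.061^2 - 1} = i 0.3536$$

We must compute various expressions involving Fresnel coefficients that appear
in (4.14):

$$\left| t_p^{0\rightarrow 1} \right|^2 = \left| \frac{2 \cos\theta_0 \sin\theta_1}{\cos\theta_1 \sin\theta_1 + \cos\theta_0 \sin\theta_0} \right|^2 = \left| \frac{2 \frac{1}{\sqrt{2}} (1.061)}{(i 0.3536)(1.061) + \frac{1}{\sqrt{2}} \frac{1}{\sqrt{2}}} \right|^2 = 5.76$$

$$\left| t_p^{1\rightarrow 2} \right|^2 = \left| \frac{2 \cos\theta_1 \sin\theta_2}{\cos\theta_2 \sin\theta_2 + \cos\theta_1 \sin\theta_1} \right|^2 = \left| \frac{2 (i 0.3536) \frac{1}{\sqrt{2}}}{\frac{1}{\sqrt{2}} \frac{1}{\sqrt{2}} + (i 0.3536)(1.061)} \right|^2 = 0.640$$

$$r_p^{0\rightarrow 1} = \frac{\cos\theta_1 \sin\theta_1 - \cos\theta_0 \sin\theta_0}{\cos\theta_1 \sin\theta_1 + \cos\theta_0 \sin\theta_0} = \frac{(i 0.3536)(1.061) - \frac{1}{\sqrt{2}} \frac{1}{\sqrt{2}}}{(i 0.3536)(1.061) + \frac{1}{\sqrt{2}} \frac{1}{\sqrt{2}}} = -e^{-i 1.287}$$

For the last step in the $r_p^{0\rightarrow 1}$ calculation, see problem P0.15. Also note that
$r_p^{1\rightarrow 2} = r_p^{1\rightarrow 0} = -r_p^{0\rightarrow 1}$ since $n_0 = n_2$. We also need

$$k_1 d \cos\theta_1 = \frac{2\pi}{\lambda_{\text{vac}}} d \cos\theta_1 = \frac{2\pi}{\lambda_{\text{vac}}} d\, (i 0.3536) = i 2.22 \left( \frac{d}{\lambda_{\text{vac}}} \right)$$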



Figure 4.4 Frustrated total internal reflection in two prisms.

Figure 4.5 Plot of (4.22) as a function of $d/\lambda_{\rm vac}$.

working on a doctoral dissertation on interference phenomena. After completing
his doctorate, he began working as a lecturer and laboratory assistant at the
University of Marseille, where a decade later he was appointed a professor of
physics. Soon after his arrival at the University of Marseille, Fabry began a long
and fruitful collaboration with Alfred Perot (1863-1925). Fabry focused on
theoretical analysis and measurements while his colleague did the design work
and construction of their new interferometer, which they continually improved
over the years. During his career, Fabry made significant contributions to
spectroscopy and astrophysics and is credited with co-discovery of the ozone
layer. See J. F. Mulligan, "Who were Fabry and Perot?," Am. J. Phys. 66,
797-802 (1998).



We are now ready to compute the total transmittance (4.14). The factors out in
front reduce to unity since $\theta_0 = \theta_2$ and $n_0 = n_2$, and we have

$$T_p^{\rm tot} = \frac{(5.76)(0.640)}{\left|e^{2.22(d/\lambda_{\rm vac})} - \left(e^{-i1.287}\right)^2 e^{-2.22(d/\lambda_{\rm vac})}\right|^2} = \frac{3.69}{e^{4.44(d/\lambda_{\rm vac})} + e^{-4.44(d/\lambda_{\rm vac})} + 1.69} \tag{4.22}$$

Figure 4.5 shows a plot of the transmittance (4.22) calculated in Example 4.3.
Notice that the transmittance is 100% when the two prisms are brought together,
as expected ($T_p^{\rm tot}(d/\lambda_{\rm vac} = 0) = 1$). When the prisms are about a wavelength apart,
the transmittance is significantly reduced, and as the distance gets large compared
to a wavelength, the transmittance quickly goes to zero ($T_p^{\rm tot}(d/\lambda_{\rm vac} \gg 1) \approx 0$).
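
The numbers in Example 4.3 can be checked with a short script. The following is only a sketch, not part of the original text: it assumes the standard two-interface sum $t^{\rm tot} = t^{0\rightarrow1}t^{1\rightarrow2}e^{ik_1 d\cos\theta_1}/(1 - r^{1\rightarrow0}r^{1\rightarrow2}e^{2ik_1 d\cos\theta_1})$, taken here to be equivalent to (4.14), and the function name `ftir_transmittance_p` and its arguments are hypothetical.

```python
import cmath
import math


def ftir_transmittance_p(d_over_lambda, n_prism=1.5, n_gap=1.0, theta0_deg=45.0):
    """p-polarized transmittance through a gap between two identical prisms.

    Assumes n0 = n2 = n_prism and theta0 = theta2, so the prefactor
    n2 cos(theta2) / (n0 cos(theta0)) equals one.
    """
    theta0 = math.radians(theta0_deg)
    sin0, cos0 = math.sin(theta0), math.cos(theta0)
    # Snell's law (4.20); sin1 exceeds one beyond the critical angle.
    sin1 = n_prism * sin0 / n_gap
    # (4.21): the principal root of a negative number is +i*sqrt(sin1^2 - 1).
    cos1 = cmath.sqrt(1.0 - sin1 * sin1)
    # Single-boundary p-polarized Fresnel coefficients (sine/cosine form).
    denom = cos1 * sin1 + cos0 * sin0
    t01 = 2.0 * cos0 * sin1 / denom
    t12 = 2.0 * cos1 * sin0 / denom
    r10 = -(cos1 * sin1 - cos0 * sin0) / denom  # equals r12 because n0 = n2
    # k1 d cos(theta1) = 2 pi n1 (d / lambda_vac) cos(theta1); imaginary here.
    phase = 2.0 * math.pi * n_gap * d_over_lambda * cos1
    t_tot = t01 * t12 * cmath.exp(1j * phase) / (1.0 - r10 * r10 * cmath.exp(2j * phase))
    return abs(t_tot) ** 2


if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 1.0, 2.0):
        print(f"d/lambda = {x:4.2f}  T = {ftir_transmittance_p(x):.4f}")
```

Because the phase is purely imaginary beyond the critical angle, the exponentials decay rather than oscillate, which is why the transmittance falls off monotonically with the gap width.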



4.4 Fabry-Perot 

In the 1890s, Charles Fabry realized that a double interface could be used to
distinguish wavelengths of light that are very close together. He and a talented
experimentalist colleague, Alfred Perot, constructed an instrument and began
to use it to make measurements on various spectral sources. The Fabry-Perot
instrument⁶ consists of two identical (parallel) surfaces separated by spacing $d$.
We can use our analysis in section 4.2 to describe this instrument. For simplicity,
we choose the refractive index before the initial surface and after the final surface
to be the same (i.e. $n_0 = n_2$). We assume that the transmission angles are such
that total internal reflection is avoided. The transmission through the device
depends on the exact spacing between the two surfaces, the reflectivity of the
surfaces, as well as on the wavelength of the light.

If the spacing $d$ separating the two parallel surfaces is adjustable (scanned),
the instrument is called a Fabry-Perot interferometer. If the spacing is fixed while
the angle of the incident light is varied, the instrument is called a Fabry-Perot
etalon. An etalon can therefore be as simple as a piece of glass with parallel
surfaces. Sometimes, a thin optical membrane called a pellicle is used as an
etalon (occasionally inserted into laser cavities to discriminate against certain
wavelengths). However, to achieve sharp discrimination between closely-spaced
wavelengths, a large spacing $d$ is desirable.

⁶ M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 7.6.2 (Cambridge University Press, 1999).

As we previously derived (4.11), the transmittance through a double boundary
is

$$T^{\rm tot} = \frac{T^{\rm max}}{1 + F\sin^2\left(\frac{\Phi}{2}\right)} \tag{4.23}$$



In the case of identical interfaces, the transmittance and reflectance coefficients
are the same at each surface (i.e. $T = T^{0\rightarrow 1} = T^{1\rightarrow 2}$ and $R = R^{1\rightarrow 0} = R^{1\rightarrow 2}$). In this
case, the maximum transmittance and the finesse coefficient simplify to

$$T^{\rm max} = \frac{T^2}{(1-R)^2} \tag{4.24}$$

and

$$F = \frac{4R}{(1-R)^2} \tag{4.25}$$

In principle, these equations should be evaluated for either s- or p-polarized light. 
However, a Fabry-Perot interferometer or etalon is usually operated near normal 
incidence so that there is little difference between the two polarizations. 

When using a Fabry-Perot instrument, one observes the transmittance $T^{\rm tot}$ as
the parameter $\Phi$ is varied. The parameter $\Phi$ can be varied by altering $d$, $\theta_1$, or $\lambda_{\rm vac}$
as prescribed by

$$\Phi = \frac{4\pi n_1 d}{\lambda_{\rm vac}}\cos\theta_1 + \delta_r \tag{4.26}$$

To increase the sensitivity of the instrument, it is desirable to have the
transmittance $T^{\rm tot}$ vary strongly when $\Phi$ is varied. By inspection of (4.23), we see that $T^{\rm tot}$
varies most strongly if the finesse coefficient $F$ is large. We achieve a large finesse
coefficient by increasing the reflectance $R$.
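
To see how strongly the reflectance controls the sharpness of the response, (4.23) through (4.25) can be evaluated directly. This is a minimal sketch: the function names are hypothetical, and it assumes lossless surfaces ($T = 1 - R$) unless a separate $T$ is supplied.

```python
import math


def finesse_coefficient(R):
    """Finesse coefficient F = 4R / (1 - R)^2 from (4.25)."""
    return 4.0 * R / (1.0 - R) ** 2


def fabry_perot_transmittance(phi, R, T=None):
    """Total transmittance (4.23) for two identical surfaces.

    If T is not given, lossless surfaces are assumed (T = 1 - R),
    which makes T_max in (4.24) equal to one.
    """
    if T is None:
        T = 1.0 - R
    t_max = T ** 2 / (1.0 - R) ** 2
    return t_max / (1.0 + finesse_coefficient(R) * math.sin(phi / 2.0) ** 2)


if __name__ == "__main__":
    # Compare how quickly the transmittance drops away from a peak (phi = 0).
    for R in (0.3, 0.6, 0.9):
        values = [fabry_perot_transmittance(phi, R) for phi in (0.0, 0.2, math.pi)]
        print(R, [round(v, 4) for v in values])
```

For a fixed small offset in $\Phi$ from a peak, the transmittance falls further as $R$ (and hence $F$) increases, which is the sensitivity described above.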

The basic setup of a Fabry-Perot instrument is shown in Fig. 4.6. In order to 
achieve a relatively high reflectivity R (and therefore large F), special coatings can 
be applied to the surfaces, for example, a thin layer of silver to achieve a partial 
reflection, say 90%. Typically, two glass substrates are separated by distance d, 
with the coated surfaces facing each other as shown in the figure. The substrates 
are aligned so that the interior surfaces are parallel to each other. It is typical for 
each substrate to be slightly wedge-shaped so that unwanted reflections from the 
outer surfaces do not interfere with the double boundary situation between the 
two plates. 

Technically, each coating constitutes its own double-boundary problem (or 
multiple-boundary as the case may be). We can ignore this detail and simply 
think of the overall setup as a single two-interface problem. Regardless of the 
details of the coatings, we can say that each coating has a certain reflectance R 
and transmittance T . However, as light goes through a coating, it can also be 
attenuated because of absorption. In this case, we have 




Jean Baptiste Alfred Perot (1863-1925, French) was born in Metz, France.
He attended the Ecole Polytechnique and then the University of Paris, where
he earned a doctorate in 1888. He became a professor in Marseille in 1894,
where he began his collaboration with Fabry. Perot contributed his considerable
talent for instrument fabrication to the endeavor. Perot spent much of his
later career making precision astronomical and solar measurements. See J. F.
Mulligan, "Who were Fabry and Perot?," Am. J. Phys. 66, 797-802 (1998).




Figure 4.6 Typical Fabry-Perot 
setup. If the spacing d is variable, 
it is called an interferometer; oth- 
erwise, it is called an etalon. 



$$R + T + A = 1 \tag{4.27}$$

Figure 4.7 Transmittance as the phase $\Phi$ is varied. The different curves
correspond to different values of the finesse coefficient. $\Phi_0$ represents a large
multiple of $2\pi$.

Figure 4.8 Setup for a Fabry-Perot interferometer.

Figure 4.9 Transmittance as the separation $d$ is varied ($F = 100$). $d_0$ represents
a large distance for which $\Phi$ is a multiple of $2\pi$.



where A represents the amount of light absorbed at a coating. The attenuation A 
reduces the amount of light that makes it through the instrument, but it does not 
impact the nature of the interferences within the instrument. 

The total transmittance $T^{\rm tot}$ (4.23) through an ideal Fabry-Perot instrument is
depicted in Fig. 4.7 as a function of $\Phi$. The various curves correspond to different
values of $F$. Typical values of $\Phi$ can be extremely large. For example, suppose
that the instrument is used at near-normal incidence (i.e. $\cos\theta_1 \cong 1$) with a
wavelength of $\lambda_{\rm vac} = 500$ nm and an interface separation of $d = 1$ cm. From
(4.26) the value of $\Phi$ (taking $n_1 = 1$ and ignoring the constant phase term $\delta_r$) is
approximately

$$\Phi_0 = \frac{4\pi(1\ \text{cm})}{500\ \text{nm}} = 80{,}000\pi$$

As we vary $d$, $\lambda_{\rm vac}$, or $\theta_1$ by small amounts, we can easily cause $\Phi$ to change by $2\pi$ as
depicted in Fig. 4.7. The figure shows small changes in $\Phi$ above a value $\Phi_0$, which
represents a large multiple of $2\pi$.

The reflection phase $\delta_r$ in (4.26) depends on the exact nature of the coatings
in the Fabry-Perot instrument. However, we do not need to know the value of $\delta_r$
(which depends on both the complex index of the coating material and its thickness).
Whatever the value of $\delta_r$, we only care that it is constant. Experimentally, we can
always compensate for $\delta_r$ by 'tweaking' the spacing $d$, whose exact value is
likely not controlled for in the first place. Note that the required 'tweak' on the
spacing need only be a fraction of a wavelength, which is typically tiny compared
to the overall spacing $d$.

4.5 Setup of a Fabry-Perot Instrument

Figure 4.8 shows the typical experimental setup for a Fabry-Perot interferometer. 
A collimated beam of light is sent through the instrument. The beam is aligned so 
that it is normal to the surfaces. It is critical for the two surfaces of the interferom- 
eter to be extremely close to parallel. When aligned correctly, the transmission 
of a collimated beam will 'blink' all together as the spacing d is changed (by tiny 
amounts). A mechanical actuator can be used to vary the spacing between the 
plates while the transmittance is observed on a detector. To make the alignment 
of the instrument somewhat less critical, a small aperture can be placed in front 
of the detector so that it observes only a small portion of the beam. 

The transmittance as a function of plate separation is shown in Fig. 4.9. In this
case, $\Phi$ varies via changes in $d$ (see (4.26) with $\cos\theta_1 = 1$ and fixed wavelength).
As the spacing is increased by only a half wavelength, the transmittance changes
through a complete period. The various peaks in the figure are called fringes.
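
The half-wavelength period follows directly from (4.26), as the sketch below illustrates. The function name `round_trip_phase` is hypothetical, and the numerical values are the ones used earlier in the text ($\lambda_{\rm vac} = 500$ nm, $d = 1$ cm, $n_1 = 1$, normal incidence).

```python
import math


def round_trip_phase(d, wavelength_vac, n1=1.0, theta1=0.0, delta_r=0.0):
    """Phase (4.26): Phi = (4 pi n1 d / lambda_vac) cos(theta1) + delta_r."""
    return 4.0 * math.pi * n1 * d * math.cos(theta1) / wavelength_vac + delta_r


if __name__ == "__main__":
    wavelength = 500e-9  # meters
    d = 1e-2             # meters
    phi0 = round_trip_phase(d, wavelength)
    print("Phi / pi at d = 1 cm:", phi0 / math.pi)
    # Increase the spacing by half a wavelength and look at the phase change.
    shift = round_trip_phase(d + wavelength / 2.0, wavelength) - phi0
    print("phase change in units of 2 pi:", shift / (2.0 * math.pi))
```

Since the transmittance (4.23) depends on $\Phi$ only through $\sin^2(\Phi/2)$, a change of $2\pi$ in $\Phi$ returns the instrument to the same point on the fringe pattern.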

The setup for a Fabry-Perot etalon is similar to that of the interferometer
except that the spacing $d$ remains fixed. Often the two surfaces in the etalon are
held parallel to each other by a precision spacer. An advantage to the Fabry-Perot
etalon (as opposed to the interferometer) is that no moving parts are needed. To
make measurements with an etalon, the angle of the light is varied rather than the
plate separation. After all, to see fringes, we just need to cause $\Phi$ in (4.23) to vary
in some way. According to (4.26), we can do that as easily by varying $\theta_1$ as we can
by varying $d$. One way to obtain a range of angles is to observe light from a 'point
source', as depicted in Fig. 4.10. Different portions of the beam go through the
device at different angles. When aligned straight on, the transmitted light forms a
'bull's-eye' pattern on a screen.

In Fig. 4.11 we graph the transmittance $T^{\rm tot}$ (4.23) as a function of angle
(holding $\lambda_{\rm vac} = 500$ nm and $d = 1$ cm fixed). Since $\cos\theta_1$ is not a linear function,
the spacing of the peaks varies with angle. As $\theta_1$ increases from zero, the cosine
steadily decreases, causing $\Phi$ to decrease. Each time $\Phi$ decreases by $2\pi$ we get a
new peak. Not surprisingly, only a modest change in angle is necessary to cause
the transmittance to vary from maximum to minimum, or vice versa.
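
The angles at which the peaks occur can be estimated from (4.26). The sketch below assumes, as in Fig. 4.11, that $\Phi$ is a multiple of $2\pi$ at $\theta_1 = 0$, so the $m$th peak satisfies $\cos\theta_m = 1 - m\lambda_{\rm vac}/(2 n_1 d)$; the function name `fringe_angles` is hypothetical.

```python
import math


def fringe_angles(num_fringes, wavelength_vac=500e-9, d=1e-2, n1=1.0):
    """Angles (degrees) where Phi has dropped by 2 pi m from its value at 0.

    From (4.26): (4 pi n1 d / lambda)(1 - cos(theta_m)) = 2 pi m.
    """
    angles = []
    for m in range(num_fringes):
        cos_theta = 1.0 - m * wavelength_vac / (2.0 * n1 * d)
        angles.append(math.degrees(math.acos(cos_theta)))
    return angles


if __name__ == "__main__":
    for m, angle in enumerate(fringe_angles(6)):
        print(f"peak {m}: theta_1 = {angle:.3f} degrees")
```

The successive peaks crowd together as the angle grows, which reflects the nonlinear dependence of $\cos\theta_1$ on $\theta_1$ noted above.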

The bull's-eye pattern in Fig. 4.10 can be understood as the curve in Fig. 4.11 
rotated about a circle. Depending on the exact spacing between the plates, the 
radii (or angles) where the fringes occur can be different. For example, the center 
spot could be dark. 

Spectroscopic samples often are not compact point-like sources. Rather, they 
are extended diffuse sources. The point-source setup shown in Fig. 4.10 won't 
work for extended sources unless all of the light at the sample is blocked except 
for a tiny point. This is impractical if there remains insufficient illumination at 
the final screen for observation. 

In order to preserve as much light as possible, we can sandwich the etalon
between two lenses. We place the diffuse source at the focal plane of the first lens.
We place the screen at the focal plane of the second lens. This causes an image of
the source to appear on the screen.⁷ Each point of the diffuse source is mapped
to a corresponding point on the screen. Moreover, the light associated with any
particular point of the source travels as a unique collimated beam in the region
between the lenses. Each collimated beam traverses the etalon with a unique
angle. Thus, light associated with each emission point traverses the etalon with
higher or lower transmittance, according to the differing angles. The result is that
a bull's-eye pattern becomes superimposed on the image of the diffuse source.
The lens and retina of your eye can be used for the final lens and screen.

Figure 4.10 A diverging monochromatic beam traversing a Fabry-Perot etalon.
(The angle of divergence is exaggerated.)

Figure 4.11 Transmittance through a Fabry-Perot etalon ($F = 10$) as the angle
$\theta_1$ is varied. It is assumed that the distance $d$ is chosen such that $\Phi$ is a
multiple of $2\pi$ when the angle is zero.
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Instrument 



Figure 4.12 Setup of a Fabry-Perot 
etalon for looking at a diffuse 
source. 



Thus far, we have examined how the transmittance through a Fabry-Perot
instrument varies with surface separation $d$ and angle $\theta_1$. However, the main purpose
of a Fabry-Perot instrument is to measure small changes in the wavelength of
light, which similarly affect the value of $\Phi$ (see (4.26)).⁸



⁷ If the diffuse source has the shape of Mickey Mouse, then an image of Mickey Mouse appears
on the screen. Imaging techniques are discussed in chapter 9.

⁸ M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 7.6.3 (Cambridge University Press, 1999).

Figure 4.13 Transmittance as the spacing $d$ is varied for two different
wavelengths ($F = 100$). The solid line plots the transmittance of light with a
wavelength of $\lambda_0$, and the dashed line plots the transmittance of a wavelength
shorter than $\lambda_0$. Note that the fringes shift positions for different
wavelengths.



Consider a Fabry-Perot interferometer where the transmittance through the
instrument is plotted as a function of surface separation $d$. Let the spacing $d_0$
correspond to the case when $\Phi$ is a multiple of $2\pi$ for the wavelength $\lambda_{\rm vac} = \lambda_0$.
Next, suppose we adjust the wavelength of the light from $\lambda_{\rm vac} = \lambda_0$ to $\lambda_{\rm vac} =
\lambda_0 + \Delta\lambda$ while observing the transmittance. As we do this, the value of $\Phi$ changes.
Figure 4.13 shows what happens as we scan the spacing $d$ of the interferometer in
the neighborhood of $d_0$. The dashed line corresponds to a different wavelength.
As the wavelength changes, the plate separation at which a particular fringe
occurs also changes.

We now derive the connection between a change in wavelength and the
amount that $\Phi$ changes, which gives rise to the fringe shift seen in Fig. 4.13. At
the wavelength $\lambda_0$, we have

$$\Phi_0 = \frac{4\pi n_1 d_0\cos\theta_1}{\lambda_0} + \delta_r \tag{4.28}$$

which we previously supposed is an integer times $2\pi$. At a new wavelength (all
else remaining the same) we have

$$\Phi = \frac{4\pi n_1 d_0\cos\theta_1}{\lambda_0 + \Delta\lambda} + \delta_r \tag{4.29}$$



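The change in wavelength $\Delta\lambda$ is usually very small compared to $\lambda_0$, so we can
represent the denominator with the first two terms of a Taylor-series expansion:

$$\frac{1}{\lambda_0 + \Delta\lambda} = \frac{1}{\lambda_0\left(1 + \Delta\lambda/\lambda_0\right)} \cong \frac{1 - \Delta\lambda/\lambda_0}{\lambda_0} \tag{4.30}$$

Then the difference between $\Phi_0$ and $\Phi$ can be rewritten as

$$\Delta\Phi \equiv \Phi_0 - \Phi = \frac{4\pi n_1 d_0\cos\theta_1}{\lambda_0^2}\,\Delta\lambda \tag{4.31}$$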



If the change in wavelength is enough to cause $\Delta\Phi = 2\pi$, the fringes in Fig. 4.13
shift through a whole period, and the picture looks the same.

This brings up an important limitation of the instrument. If the fringes shift 
by too much, we might become confused as to what exactly has changed, owing 
to the periodic nature of the fringes. If two wavelengths aren't sufficiently close, 
the fringes of one wavelength may be shifted past several fringes of the other 
wavelength, and we will not be able to tell by how much they differ. 

This introduces the concept of free spectral range, which is the wavelength
change $\Delta\lambda_{\rm FSR}$ that causes the fringes to shift through one period. We find this by
setting (4.31) equal to $2\pi$. After rearranging (and writing $\lambda_{\rm vac}$ for $\lambda_0$), we get

$$\Delta\lambda_{\rm FSR} = \frac{\lambda_{\rm vac}^2}{2 n_1 d_0\cos\theta_1} \tag{4.32}$$

The free spectral range tends to be extremely narrow; a Fabry-Perot instrument is
not well suited for measuring wavelength ranges wider than this. In summary, the
free spectral range is the largest change in wavelength permissible while avoiding
confusion. To convert this wavelength difference $\Delta\lambda_{\rm FSR}$ into a corresponding
frequency difference, one differentiates $\nu = c/\lambda_{\rm vac}$ to get

$$\left|\Delta\nu_{\rm FSR}\right| = \frac{c\,\Delta\lambda_{\rm FSR}}{\lambda_{\rm vac}^2} \tag{4.33}$$



Example 4.4 

A Fabry-Perot interferometer has plate spacing $d = 1$ cm and index $n_1 = 1$. If it
is used in the neighborhood of $\lambda_{\rm vac} = 500$ nm, find the free spectral range of the
instrument.

Solution: From (4.32), the free spectral range is

$$\Delta\lambda_{\rm FSR} = \frac{\lambda_{\rm vac}^2}{2 n_1 d_0\cos\theta_1} = \frac{(500\ \text{nm})^2}{2(1)(1\ \text{cm})\cos 0^\circ} = 0.0125\ \text{nm}$$

This means that we should not use the instrument to distinguish wavelengths that
are separated by more than this small amount.
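
The arithmetic of Example 4.4, together with the frequency form (4.33), can be reproduced as follows. This is a sketch with hypothetical function names; lengths are in meters and the speed of light is taken as $2.998\times10^8$ m/s.

```python
def free_spectral_range(wavelength_vac, d, n1=1.0, cos_theta1=1.0):
    """Free spectral range in wavelength, from (4.32)."""
    return wavelength_vac ** 2 / (2.0 * n1 * d * cos_theta1)


def free_spectral_range_frequency(wavelength_vac, d, n1=1.0, cos_theta1=1.0, c=2.998e8):
    """Free spectral range in frequency, from (4.33)."""
    fsr = free_spectral_range(wavelength_vac, d, n1, cos_theta1)
    return c * fsr / wavelength_vac ** 2


if __name__ == "__main__":
    fsr_m = free_spectral_range(500e-9, 1e-2)
    print("FSR (nm): ", fsr_m * 1e9)
    print("FSR (GHz):", free_spectral_range_frequency(500e-9, 1e-2) / 1e9)
```

Note that in frequency units the wavelength cancels, leaving $c/(2 n_1 d_0\cos\theta_1)$, so the free spectral range in frequency depends only on the plate geometry and index.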

We next consider the smallest change in wavelength that can be noticed, 
or resolved with a Fabry-Perot instrument. For example, if two very near-by 
wavelengths are sent through the instrument simultaneously, we can distinguish 
them only if the separation between their corresponding fringe peaks is at least 
as large as the width of individual peaks. This situation of two barely resolvable 
fringe peaks is illustrated in Fig. 4.14 for a diverging beam traversing an etalon. 

We will look for the wavelength change that causes a peak to shift by its own
width. We define the width of a peak by its full width at half maximum (FWHM).
Again, let $\Phi_0$ be a multiple of $2\pi$ so that a peak in transmittance occurs when
$\Phi = \Phi_0$. In this case, we have from (4.23) that

$$T^{\rm tot} = \frac{T^{\rm max}}{1 + F\sin^2\left(\frac{\Phi_0}{2}\right)} = T^{\rm max} \tag{4.34}$$

since $\sin(\Phi_0/2) = 0$. If $\Phi$ varies from $\Phi_0$ to $\Phi_0 + \Phi_{\rm FWHM}/2$, then, by definition, the
transmittance drops to one half. Therefore, we may write




Figure 4.14 Transmittance of a 
diverging beam through a Fabry- 
Perot etalon. Two nearby wave- 
lengths are sent through the in- 
strument simultaneously, (top) 
barely resolved and (bottom) eas- 
ily resolved. 



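$$T^{\rm tot} = \frac{T^{\rm max}}{1 + F\sin^2\left(\frac{\Phi_0 + \Phi_{\rm FWHM}/2}{2}\right)} = \frac{T^{\rm max}}{2} \tag{4.35}$$

In solving (4.35) for $\Phi_{\rm FWHM}$, we see that this equation requires

$$F\sin^2\left(\frac{\Phi_{\rm FWHM}}{4}\right) = 1 \tag{4.36}$$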



where we have taken advantage of the fact that $\Phi_0$ is a multiple of $2\pi$. Next,
we suppose that $\Phi_{\rm FWHM}$ is rather small so that we may represent the sine by its
argument. This approximation is okay if the finesse coefficient $F$ is rather large
(say, 100). With this approximation, (4.36) simplifies to

$$\Phi_{\rm FWHM} \cong \frac{4}{\sqrt{F}} \tag{4.37}$$

The ratio of the period between peaks $2\pi$ to the width $\Phi_{\rm FWHM}$ of individual peaks
is called the reflecting finesse (or just finesse):

$$f \equiv \frac{2\pi}{\Phi_{\rm FWHM}} = \frac{\pi\sqrt{F}}{2} \tag{4.38}$$

This parameter is often used to characterize the performance of a Fabry-Perot
instrument. Note that a higher finesse $f$ implies sharper fringes in comparison to
the fringe spacing.
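
How good the small-angle step from (4.36) to (4.37) is can be checked numerically by comparing it with the exact solution $\Phi_{\rm FWHM} = 4\arcsin(1/\sqrt{F})$. The sketch below uses hypothetical function names and requires $F \ge 1$ for the exact form.

```python
import math


def fwhm_exact(F):
    """Solve F sin^2(Phi_FWHM / 4) = 1 from (4.36) without approximation."""
    return 4.0 * math.asin(1.0 / math.sqrt(F))


def fwhm_small_angle(F):
    """Small-angle result (4.37): Phi_FWHM ~ 4 / sqrt(F)."""
    return 4.0 / math.sqrt(F)


def finesse(F):
    """Reflecting finesse (4.38): f = 2 pi / Phi_FWHM = pi sqrt(F) / 2."""
    return math.pi * math.sqrt(F) / 2.0


if __name__ == "__main__":
    for F in (4.0, 10.0, 100.0, 1000.0):
        exact, approx = fwhm_exact(F), fwhm_small_angle(F)
        print(F, exact, approx, (exact - approx) / exact)
```

The relative error of the approximation shrinks as $F$ grows, consistent with the remark that the approximation is acceptable when $F$ is around 100 or larger.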

The ratio of the free spectral range $\Delta\lambda_{\rm FSR}$ to the minimum resolvable wavelength
change $\Delta\lambda_{\rm FWHM}$ is the same ratio $f$. Therefore, we have

$$\Delta\lambda_{\rm FWHM} = \frac{\Delta\lambda_{\rm FSR}}{f} = \frac{\lambda_{\rm vac}^2}{\pi n_1 d_0\cos\theta_1\sqrt{F}} \tag{4.39}$$

As a final note, the ratio of $\lambda_0$ to $\Delta\lambda_{\rm min}$, where $\Delta\lambda_{\rm min}$ is the minimum change
of wavelength that the instrument can distinguish in the neighborhood of $\lambda_0$, is
called the resolving power. For a Fabry-Perot instrument it is

$$RP \equiv \frac{\lambda_0}{\Delta\lambda_{\rm FWHM}} \tag{4.40}$$

Fabry-Perot instruments tend to have very high resolving powers since they
respond to very small differences in wavelength.



Example 4.5 

If the Fabry-Perot interferometer in Example 4.4 has reflectivity $R = 0.85$, find the
finesse, the minimum distinguishable wavelength separation, and the resolving
power.

Solution: From (4.25), the finesse coefficient is

$$F = \frac{4R}{(1-R)^2} = \frac{4(0.85)}{(1-(0.85))^2} = 151$$

and by (4.38) the finesse is

$$f = \frac{\pi\sqrt{F}}{2} = \frac{\pi\sqrt{151}}{2} = 19.3$$

The minimum resolvable wavelength change is then

$$\Delta\lambda_{\rm FWHM} = \frac{\Delta\lambda_{\rm FSR}}{f} = \frac{0.0125\ \text{nm}}{19} = 0.00065\ \text{nm} \tag{4.41}$$

The instrument can distinguish two wavelengths separated by this tiny amount,
which gives an impressive resolving power of

$$RP = \frac{\lambda_{\rm vac}}{\Delta\lambda_{\rm FWHM}} = \frac{500\ \text{nm}}{0.00065\ \text{nm}} = 772{,}000$$

For comparison, the resolving power of a typical grating spectrometer is much less
(a few thousand). However, a grating spectrometer has the advantage that it can
simultaneously observe wavelengths over hundreds of nanometers, whereas the
Fabry-Perot instrument is confined to the extremely narrow free spectral range.
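
The chain of calculations in Examples 4.4 and 4.5 can be collected into one routine. This is a minimal sketch with a hypothetical function name; it simply strings together (4.25), (4.32), (4.38), (4.39), and (4.40) at normal incidence by default.

```python
import math


def fabry_perot_summary(R, wavelength_vac, d, n1=1.0, cos_theta1=1.0):
    """Return finesse coefficient, finesse, FSR, FWHM, and resolving power."""
    F = 4.0 * R / (1.0 - R) ** 2                               # (4.25)
    f = math.pi * math.sqrt(F) / 2.0                            # (4.38)
    fsr = wavelength_vac ** 2 / (2.0 * n1 * d * cos_theta1)     # (4.32)
    fwhm = fsr / f                                              # (4.39)
    resolving_power = wavelength_vac / fwhm                     # (4.40)
    return {"F": F, "f": f, "fsr": fsr, "fwhm": fwhm, "RP": resolving_power}


if __name__ == "__main__":
    result = fabry_perot_summary(R=0.85, wavelength_vac=500e-9, d=1e-2)
    for key, value in result.items():
        print(key, value)
```

Carrying unrounded intermediate values through the calculation gives a resolving power close to the 772,000 quoted in the example; rounding $\Delta\lambda_{\rm FWHM}$ to 0.00065 nm first would give a slightly smaller number.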



4.7 Multilayer Coatings 

As we saw in Example 4.2, a single coating cannot always accomplish a desired 
effect, especially if the goal is to make a highly reflective mirror. For example, if 
we want to make a mirror surface using a dielectric (i.e. nonmetallic) coating, 
a single layer is insufficient to reflect the majority of the light. In P4.5 we find 
that a single dielectric layer deposited on glass can reflect at most about 46% 
of the light, even when we used a material with very high index. We would like 
to do much better (e.g. >99%), and this can be accomplished with multilayer 
dielectric coatings. Multilayer dielectric coatings can perform considerably better 
than metal surfaces such as silver and have the advantage of being less prone to 
damage. 

In this section, we develop the formalism for dealing with arbitrary numbers
of parallel interfaces (i.e. multilayer coatings).⁹ Rather than incorporate the
single-interface Fresnel coefficients into the problem as we did in section 4.1, we will
find it easier to return to the fundamental boundary conditions for the electric
and magnetic fields at each interface between the layers.

We examine p-polarized light incident on an arbitrary multilayer coating (all
interfaces parallel to each other). It is left as an exercise to re-derive the formalism
for s-polarized light (see P4.13). The upcoming derivation is valid also for complex
refractive indices, although our notation suggests real indices. The ability to deal
with complex indices is very important if, for example, we want to make mirror
coatings work in the extreme ultraviolet wavelength range where virtually every
material is absorptive. Consider the diagram of a multilayer coating in Fig. 4.15
for which the angle of light propagation in each region may be computed from
Snell's law:

$$n_0\sin\theta_0 = n_1\sin\theta_1 = \cdots = n_N\sin\theta_N = n_{N+1}\sin\theta_{N+1} \tag{4.42}$$

where $N$ denotes the number of layers in the coating. The subscript 0 represents
the initial medium outside of the multilayer, and the subscript $N + 1$ represents
the final material, or the substrate on which the layers are deposited.

⁹ G. R. Fowles, Introduction to Modern Optics, 2nd ed., Sect. 4.4 (New York: Dover, 1975); E. Hecht,
Optics, 3rd ed., Sect. 9.7.1 (Massachusetts: Addison-Wesley, 1998).

Figure 4.15 Light propagation through multiple layers.



In each layer, only two plane waves exist, each of which is composed of light
arising from the many possible bounces from various layer interfaces. The arrows
pointing right indicate plane wave fields in individual layers that travel roughly
in the forward (incident) direction, and the arrows pointing left indicate plane
wave fields that travel roughly in the backward (reflected) direction. In the final
region, there is only one plane wave traveling with a forward direction ($E^{(p)}_{N+1\rightarrow}$),
which gives the overall transmitted field.

As we have studied in chapter 3 (see (3.9) and (3.13)), the boundary conditions
for the parallel components of the E field and for the parallel components of the
B field lead respectively to

$$\cos\theta_0\left(E^{(p)}_{0\rightarrow} + E^{(p)}_{0\leftarrow}\right) = \cos\theta_1\left(E^{(p)}_{1\rightarrow} + E^{(p)}_{1\leftarrow}\right) \tag{4.43}$$

and

$$n_0\left(E^{(p)}_{0\rightarrow} - E^{(p)}_{0\leftarrow}\right) = n_1\left(E^{(p)}_{1\rightarrow} - E^{(p)}_{1\leftarrow}\right) \tag{4.44}$$

Similar equations give the field connection for s-polarized light (see (3.8) and
(3.14)).

We have applied these boundary conditions at the first interface only. Of
course there are many more interfaces in the multilayer. For the connection
between the $j$th layer and the next, we may similarly write

$$\cos\theta_j\left(E^{(p)}_{j\rightarrow}e^{ik_j\ell_j\cos\theta_j} + E^{(p)}_{j\leftarrow}e^{-ik_j\ell_j\cos\theta_j}\right) = \cos\theta_{j+1}\left(E^{(p)}_{j+1\rightarrow} + E^{(p)}_{j+1\leftarrow}\right) \tag{4.45}$$

and

$$n_j\left(E^{(p)}_{j\rightarrow}e^{ik_j\ell_j\cos\theta_j} - E^{(p)}_{j\leftarrow}e^{-ik_j\ell_j\cos\theta_j}\right) = n_{j+1}\left(E^{(p)}_{j+1\rightarrow} - E^{(p)}_{j+1\leftarrow}\right) \tag{4.46}$$

Here we have set the origin within each layer at the left surface. Then when
making the connection with the subsequent layer at the right surface, we must
specifically take into account the phase $\mathbf{k}_j\cdot(\ell_j\hat{\mathbf{z}}) = k_j\ell_j\cos\theta_j$. This corresponds
to the phase acquired by the plane wave field in traversing the layer with thickness
$\ell_j$. The right-hand sides of (4.45) and (4.46) need no phase adjustment since the
$(j+1)$th field is evaluated on the left side of its layer.

At the final interface, the boundary conditions are

$$\cos\theta_N\left(E^{(p)}_{N\rightarrow}e^{ik_N\ell_N\cos\theta_N} + E^{(p)}_{N\leftarrow}e^{-ik_N\ell_N\cos\theta_N}\right) = \cos\theta_{N+1}E^{(p)}_{N+1\rightarrow} \tag{4.47}$$

and

$$n_N\left(E^{(p)}_{N\rightarrow}e^{ik_N\ell_N\cos\theta_N} - E^{(p)}_{N\leftarrow}e^{-ik_N\ell_N\cos\theta_N}\right) = n_{N+1}E^{(p)}_{N+1\rightarrow} \tag{4.48}$$

since there is no backward-traveling field in the final medium.

At this point we are ready to solve (4.43)-(4.48). We would like to eliminate
all fields besides $E^{(p)}_{0\rightarrow}$, $E^{(p)}_{0\leftarrow}$, and $E^{(p)}_{N+1\rightarrow}$. Then we will be able to find the overall
reflectance and transmittance of the multilayer coating. In solving (4.43)-(4.48),
we must proceed with care, or the algebra can quickly get out of hand. Fortunately,
you have probably had training in linear algebra, and this is a case where that
training pays off.

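We first write a general matrix equation that summarizes the mathematics in
(4.43)-(4.48), as follows:

$$\begin{bmatrix}\cos\theta_j e^{i\beta_j} & \cos\theta_j e^{-i\beta_j}\\ n_j e^{i\beta_j} & -n_j e^{-i\beta_j}\end{bmatrix}\begin{bmatrix}E^{(p)}_{j\rightarrow}\\ E^{(p)}_{j\leftarrow}\end{bmatrix} = \begin{bmatrix}\cos\theta_{j+1} & \cos\theta_{j+1}\\ n_{j+1} & -n_{j+1}\end{bmatrix}\begin{bmatrix}E^{(p)}_{j+1\rightarrow}\\ E^{(p)}_{j+1\leftarrow}\end{bmatrix} \tag{4.49}$$

where

$$\beta_j \equiv \begin{cases}0 & j = 0\\ k_j\ell_j\cos\theta_j & 1 \le j \le N\end{cases} \tag{4.50}$$

and

$$E^{(p)}_{N+1\leftarrow} \equiv 0 \tag{4.51}$$

(It would be good to take a moment to convince yourself that this set of matrix
equations properly represents (4.43)-(4.48) before proceeding.) We rewrite (4.49)
as

$$\begin{bmatrix}E^{(p)}_{j\rightarrow}\\ E^{(p)}_{j\leftarrow}\end{bmatrix} = \begin{bmatrix}\cos\theta_j e^{i\beta_j} & \cos\theta_j e^{-i\beta_j}\\ n_j e^{i\beta_j} & -n_j e^{-i\beta_j}\end{bmatrix}^{-1}\begin{bmatrix}\cos\theta_{j+1} & \cos\theta_{j+1}\\ n_{j+1} & -n_{j+1}\end{bmatrix}\begin{bmatrix}E^{(p)}_{j+1\rightarrow}\\ E^{(p)}_{j+1\leftarrow}\end{bmatrix} \tag{4.52}$$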

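Keep in mind that (4.52) represents a distinct matrix equation for each
different $j$. We can substitute the $j = 1$ equation into the $j = 0$ equation to get

$$\begin{bmatrix}E^{(p)}_{0\rightarrow}\\ E^{(p)}_{0\leftarrow}\end{bmatrix} = \begin{bmatrix}\cos\theta_0 & \cos\theta_0\\ n_0 & -n_0\end{bmatrix}^{-1} M_1^{(p)}\begin{bmatrix}\cos\theta_2 & \cos\theta_2\\ n_2 & -n_2\end{bmatrix}\begin{bmatrix}E^{(p)}_{2\rightarrow}\\ E^{(p)}_{2\leftarrow}\end{bmatrix} \tag{4.53}$$

where we have grouped the matrices related to the $j = 1$ layer together via

$$M_1^{(p)} \equiv \begin{bmatrix}\cos\theta_1 & \cos\theta_1\\ n_1 & -n_1\end{bmatrix}\begin{bmatrix}\cos\theta_1 e^{i\beta_1} & \cos\theta_1 e^{-i\beta_1}\\ n_1 e^{i\beta_1} & -n_1 e^{-i\beta_1}\end{bmatrix}^{-1} \tag{4.54}$$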



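We can continue to substitute into this equation progressively higher order
equations (i.e. for $j = 2$, $j = 3$, ... ) until we reach the $j = N$ layer. All together this will
give

$$\begin{bmatrix}E^{(p)}_{0\rightarrow}\\ E^{(p)}_{0\leftarrow}\end{bmatrix} = \begin{bmatrix}\cos\theta_0 & \cos\theta_0\\ n_0 & -n_0\end{bmatrix}^{-1}\left(\prod_{j=1}^{N} M_j^{(p)}\right)\begin{bmatrix}\cos\theta_{N+1} & \cos\theta_{N+1}\\ n_{N+1} & -n_{N+1}\end{bmatrix}\begin{bmatrix}E^{(p)}_{N+1\rightarrow}\\ 0\end{bmatrix} \tag{4.55}$$

where the matrices related to the $j$th layer are grouped together according to

$$M_j^{(p)} \equiv \begin{bmatrix}\cos\theta_j & \cos\theta_j\\ n_j & -n_j\end{bmatrix}\begin{bmatrix}\cos\theta_j e^{i\beta_j} & \cos\theta_j e^{-i\beta_j}\\ n_j e^{i\beta_j} & -n_j e^{-i\beta_j}\end{bmatrix}^{-1} = \begin{bmatrix}\cos\beta_j & -i\sin\beta_j\cos\theta_j/n_j\\ -i n_j\sin\beta_j/\cos\theta_j & \cos\beta_j\end{bmatrix} \tag{4.56}$$

The matrix inversion in the first line was performed using (0.35). The symbol $\prod$
signifies the product of the matrices with the lowest subscripts on the left:

$$\prod_{j=1}^{N} M_j^{(p)} \equiv M_1^{(p)} M_2^{(p)} \cdots M_N^{(p)} \tag{4.57}$$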



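As a finishing touch, we divide (4.55) by the incident field $E^{(p)}_{0\rightarrow}$ as well as perform
the matrix inversion on the right-hand side to obtain

$$\begin{bmatrix}1\\ E^{(p)}_{0\leftarrow}/E^{(p)}_{0\rightarrow}\end{bmatrix} = A^{(p)}\begin{bmatrix}E^{(p)}_{N+1\rightarrow}/E^{(p)}_{0\rightarrow}\\ 0\end{bmatrix} \tag{4.58}$$

where

$$A^{(p)} \equiv \begin{bmatrix}a_{11}^{(p)} & a_{12}^{(p)}\\ a_{21}^{(p)} & a_{22}^{(p)}\end{bmatrix} = \frac{1}{2n_0\cos\theta_0}\begin{bmatrix}n_0 & \cos\theta_0\\ n_0 & -\cos\theta_0\end{bmatrix}\left(\prod_{j=1}^{N} M_j^{(p)}\right)\begin{bmatrix}\cos\theta_{N+1} & 0\\ n_{N+1} & 0\end{bmatrix} \tag{4.59}$$

In the final matrix in (4.59) we have replaced the entries in the right column with
zeros. This is permissible since it operates on a column vector with zero in the
bottom component.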

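Equation (4.58) represents two equations, which must be solved simultaneously
to find the ratios $E^{(p)}_{0\leftarrow}/E^{(p)}_{0\rightarrow}$ and $E^{(p)}_{N+1\rightarrow}/E^{(p)}_{0\rightarrow}$. Once the matrix $A^{(p)}$ is computed,
this is a relatively simple task:

$$t_p \equiv \frac{E^{(p)}_{N+1\rightarrow}}{E^{(p)}_{0\rightarrow}} = \frac{1}{a_{11}^{(p)}} \qquad \text{(Multilayer)} \tag{4.60}$$

$$r_p \equiv \frac{E^{(p)}_{0\leftarrow}}{E^{(p)}_{0\rightarrow}} = \frac{a_{21}^{(p)}}{a_{11}^{(p)}} \qquad \text{(Multilayer)} \tag{4.61}$$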

The convenience of this notation lies in the fact that we can deal with an
arbitrary number of layers $N$ with varying thickness and index. The essential
information for each layer is contained succinctly in its respective 2×2
characteristic matrix $M$. To find the overall effect of the many layers, we need only
multiply the matrices for each layer together to find $A$, from which we compute
the reflection and transmission coefficients for the whole system.
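
The procedure in (4.56) through (4.61) translates almost line for line into code. The following is a sketch rather than a definitive implementation: the function names, the representation of layers as `(index, thickness)` pairs, and the use of plain nested tuples for the 2×2 matrices are all choices made here for illustration.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def multilayer_p(n0, theta0, layers, n_sub, wavelength_vac):
    """Return (t_p, r_p) for p-polarized light on a multilayer coating.

    layers: list of (index, thickness) pairs, ordered from the incident side.
    Indices may be complex; theta0 is in radians.
    """
    s = n0 * math.sin(theta0)  # n sin(theta) is the same in every region (4.42)

    def cos_of(n):
        return cmath.sqrt(1.0 - (s / n) ** 2)

    product = ((1.0, 0.0), (0.0, 1.0))
    for n_j, thickness in layers:
        c_j = cos_of(n_j)
        beta = 2.0 * math.pi * n_j * thickness * c_j / wavelength_vac  # (4.50)
        m_j = (
            (cmath.cos(beta), -1j * cmath.sin(beta) * c_j / n_j),
            (-1j * n_j * cmath.sin(beta) / c_j, cmath.cos(beta)),
        )  # characteristic matrix (4.56)
        product = matmul(product, m_j)  # lowest subscripts on the left (4.57)

    c0, c_sub = math.cos(theta0), cos_of(n_sub)
    left = ((n0, c0), (n0, -c0))
    right = ((c_sub, 0.0), (n_sub, 0.0))
    a = matmul(matmul(left, product), right)  # (4.59) without its prefactor
    a11 = a[0][0] / (2.0 * n0 * c0)
    a21 = a[1][0] / (2.0 * n0 * c0)
    return 1.0 / a11, a21 / a11  # (4.60) and (4.61)


if __name__ == "__main__":
    # A single quarter-wave layer of index 1.38 on glass at normal incidence.
    lam = 633e-9
    t, r = multilayer_p(1.0, 0.0, [(1.38, lam / (4 * 1.38))], 1.5, lam)
    print("reflectance:", abs(r) ** 2)
```

With an empty layer list the routine reduces to the single-boundary Fresnel coefficients, which is a convenient sanity check before adding layers.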

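The derivation for s-polarized light is similar to the above derivation for
p-polarized light. The equation corresponding to (4.58) for s-polarized light turns
out to be

$$\begin{bmatrix}1\\ E^{(s)}_{0\leftarrow}/E^{(s)}_{0\rightarrow}\end{bmatrix} = A^{(s)}\begin{bmatrix}E^{(s)}_{N+1\rightarrow}/E^{(s)}_{0\rightarrow}\\ 0\end{bmatrix} \tag{4.62}$$

where

$$A^{(s)} \equiv \begin{bmatrix}a_{11}^{(s)} & a_{12}^{(s)}\\ a_{21}^{(s)} & a_{22}^{(s)}\end{bmatrix} = \frac{1}{2n_0\cos\theta_0}\begin{bmatrix}n_0\cos\theta_0 & 1\\ n_0\cos\theta_0 & -1\end{bmatrix}\left(\prod_{j=1}^{N} M_j^{(s)}\right)\begin{bmatrix}1 & 0\\ n_{N+1}\cos\theta_{N+1} & 0\end{bmatrix} \tag{4.63}$$

and

$$M_j^{(s)} = \begin{bmatrix}\cos\beta_j & -i\sin\beta_j/(n_j\cos\theta_j)\\ -i n_j\cos\theta_j\sin\beta_j & \cos\beta_j\end{bmatrix} \tag{4.64}$$

The transmission and reflection coefficients are found (as before) from

$$t_s \equiv \frac{E^{(s)}_{N+1\rightarrow}}{E^{(s)}_{0\rightarrow}} = \frac{1}{a_{11}^{(s)}} \qquad \text{(Multilayer)} \tag{4.65}$$

$$r_s \equiv \frac{E^{(s)}_{0\leftarrow}}{E^{(s)}_{0\rightarrow}} = \frac{a_{21}^{(s)}}{a_{11}^{(s)}} \qquad \text{(Multilayer)} \tag{4.66}$$

Many different types of multilayer coatings are possible. For example, a
Brewster's-angle polarizer has a coating designed to transmit with high efficiency p-polarized
light while simultaneously reflecting s-polarized light with high efficiency. The
backside of the substrate is left uncoated where p-polarized light passes with
100% efficiency at Brewster's angle.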

Sometimes multilayer coatings are made with repeated stacks of layers. In
general, if the same series of layers in (4.69) is repeated many times, say $q$ times,
Sylvester's theorem (see appendix 0.3) can come in handy:

$$\begin{bmatrix}A & B\\ C & D\end{bmatrix}^q = \frac{1}{\sin\theta}\begin{bmatrix}A\sin q\theta - \sin(q-1)\theta & B\sin q\theta\\ C\sin q\theta & D\sin q\theta - \sin(q-1)\theta\end{bmatrix} \tag{4.67}$$

where

$$\cos\theta \equiv \frac{1}{2}(A + D). \tag{4.68}$$



This formula relies on the condition AD - BC = 1, which is true for matrices of 
the form (4.56) and (4.64) or any product of them. Here, A, B, C, and D represent 
the elements of a matrix composed of a block of matrices corresponding to a 
repeated pattern within the stack. 
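
Sylvester's theorem is easy to test numerically against brute-force matrix multiplication. The sketch below builds a two-layer block from matrices of the form (4.56) and compares the two ways of raising it to the $q$th power; the function names and the sample indices and phases are arbitrary choices for illustration, and `cmath.acos` is used so that the case $|A + D| > 2$ (complex $\theta$) is also handled.

```python
import cmath


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def layer_matrix_p(n, cos_theta, beta):
    """Characteristic matrix of one layer for p-polarized light, as in (4.56)."""
    return (
        (cmath.cos(beta), -1j * cmath.sin(beta) * cos_theta / n),
        (-1j * n * cmath.sin(beta) / cos_theta, cmath.cos(beta)),
    )


def power_by_multiplication(m, q):
    """Raise a 2x2 matrix to the q-th power by repeated multiplication."""
    result = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(q):
        result = matmul(result, m)
    return result


def power_by_sylvester(m, q):
    """Raise a unimodular 2x2 matrix to the q-th power using (4.67)-(4.68)."""
    (A, B), (C, D) = m
    theta = cmath.acos((A + D) / 2.0)
    s = cmath.sin(theta)
    sq = cmath.sin(q * theta)
    sq1 = cmath.sin((q - 1) * theta)
    return (((A * sq - sq1) / s, B * sq / s), (C * sq / s, (D * sq - sq1) / s))


if __name__ == "__main__":
    block = matmul(layer_matrix_p(2.32, 1.0, 0.7), layer_matrix_p(1.38, 1.0, 1.1))
    print(power_by_multiplication(block, 5))
    print(power_by_sylvester(block, 5))
```

The formula breaks down when $\sin\theta = 0$ (that is, $A + D = \pm 2$), so a practical routine would need to treat that case separately.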

In general, high-reflection coatings are designed with alternating high and
low refractive indices. For high reflectivity, each layer should have a quarter-wave
thickness. Since the layers alternate high and low indices, at every other
boundary there is a phase shift of $\pi$ upon reflection from the interface. Hence,
the quarter wavelength spacing is appropriate to give constructive interference in
the reflected direction.

Figure 4.16 A repeated multilayer structure with alternating high and low
indexes ($n_H$, $n_L$, $n_H$, $n_L$, ...) where each layer is a quarter wavelength in
thickness. This structure can achieve very high reflectance.

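Example 4.6

Derive the reflection and transmission coefficients for p-polarized light
interacting with a high reflector constructed using a $\lambda/4$ stack.

Solution: For a $\lambda/4$ stack we need

$$ \beta_j = \frac{\pi}{2} $$

This amounts to a thickness requirement of

$$ \ell_j = \frac{\lambda_{\rm vac}}{4 n_j \cos\theta_j} $$

In this situation, the matrix (4.56) for each layer simplifies to

$$ M_j^{(p)} = \begin{bmatrix} 0 & -i\cos\theta_j / n_j \\ -i n_j / \cos\theta_j & 0 \end{bmatrix} $$

The matrices for a high and a low refractive index layer are multiplied together in
the usual manner. Each layer pair takes the form

$$ \begin{bmatrix} 0 & -\frac{i\cos\theta_H}{n_H} \\ -\frac{i n_H}{\cos\theta_H} & 0 \end{bmatrix}
\begin{bmatrix} 0 & -\frac{i\cos\theta_L}{n_L} \\ -\frac{i n_L}{\cos\theta_L} & 0 \end{bmatrix}
= \begin{bmatrix} -\frac{n_L\cos\theta_H}{n_H\cos\theta_L} & 0 \\ 0 & -\frac{n_H\cos\theta_L}{n_L\cos\theta_H} \end{bmatrix} $$

To extend to $q = N/2$ identical layer pairs, we have

$$ \prod_{j=1}^{N} M_j^{(p)} = \begin{bmatrix} -\frac{n_L\cos\theta_H}{n_H\cos\theta_L} & 0 \\ 0 & -\frac{n_H\cos\theta_L}{n_L\cos\theta_H} \end{bmatrix}^{q}
= \begin{bmatrix} \left(-\frac{n_L\cos\theta_H}{n_H\cos\theta_L}\right)^{q} & 0 \\ 0 & \left(-\frac{n_H\cos\theta_L}{n_L\cos\theta_H}\right)^{q} \end{bmatrix} $$

Substituting this into (4.59), we obtain

$$ A^{(p)} = \frac{1}{2}\begin{bmatrix}
\left(-\frac{n_L\cos\theta_H}{n_H\cos\theta_L}\right)^{q}\frac{\cos\theta_{N+1}}{\cos\theta_0} + \left(-\frac{n_H\cos\theta_L}{n_L\cos\theta_H}\right)^{q}\frac{n_{N+1}}{n_0} & 0 \\
\left(-\frac{n_L\cos\theta_H}{n_H\cos\theta_L}\right)^{q}\frac{\cos\theta_{N+1}}{\cos\theta_0} - \left(-\frac{n_H\cos\theta_L}{n_L\cos\theta_H}\right)^{q}\frac{n_{N+1}}{n_0} & 0
\end{bmatrix} \tag{4.69} $$

With $A^{(p)}$ in hand, we can now calculate the transmission coefficient from (4.60)

$$ t_p = \frac{2}{\left(-\frac{n_L\cos\theta_H}{n_H\cos\theta_L}\right)^{q}\frac{\cos\theta_{N+1}}{\cos\theta_0} + \left(-\frac{n_H\cos\theta_L}{n_L\cos\theta_H}\right)^{q}\frac{n_{N+1}}{n_0}} \qquad \text{($\lambda/4$ stack, p-polarized)} \tag{4.70} $$

and the reflection coefficient from (4.61)

$$ r_p = \frac{\left(-\frac{n_L\cos\theta_H}{n_H\cos\theta_L}\right)^{q}\frac{\cos\theta_{N+1}}{\cos\theta_0} - \left(-\frac{n_H\cos\theta_L}{n_L\cos\theta_H}\right)^{q}\frac{n_{N+1}}{n_0}}{\left(-\frac{n_L\cos\theta_H}{n_H\cos\theta_L}\right)^{q}\frac{\cos\theta_{N+1}}{\cos\theta_0} + \left(-\frac{n_H\cos\theta_L}{n_L\cos\theta_H}\right)^{q}\frac{n_{N+1}}{n_0}} \qquad \text{($\lambda/4$ stack, p-polarized)} \tag{4.71} $$

The quarter-wave multilayer considered in Example 4.6 can achieve extraordinarily
high reflectivity. In the limit of $q \to \infty$, we have $t_p \to 0$ and
$r_p \to -1$ (see Fig. 4.17), giving 100% reflection with a $\pi$ phase shift.

Figure 4.17 The transmission and reflection coefficients for a quarter wave stack
as $q$ is varied ($n_L = 1.38$ and $n_H = 2.32$).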
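
The trend in Fig. 4.17 can be reproduced with a few lines of code. The following is a minimal sketch of (4.70) and (4.71), assuming normal incidence and air on both sides of the stack; those surroundings are an assumption, since the caption only specifies $n_L$ and $n_H$. The function name `quarter_wave_stack_p` is hypothetical.

```python
import numpy as np

def quarter_wave_stack_p(q, n_H, n_L, n_0=1.0, n_sub=1.0, theta_0=0.0):
    """Return (t_p, r_p) from (4.70) and (4.71) for q high/low layer pairs.

    n_0 is the incident index and n_sub plays the role of n_{N+1};
    both default to 1 (air), which is an assumption made for illustration.
    """
    s = n_0 * np.sin(theta_0)                    # conserved by Snell's law
    cos_in = lambda n: np.sqrt(1 - (s / n) ** 2)  # cos(theta) in a medium of index n
    c0, cH, cL, cS = cos_in(n_0), cos_in(n_H), cos_in(n_L), cos_in(n_sub)
    a = (-n_L * cH / (n_H * cL)) ** q * cS / c0
    b = (-n_H * cL / (n_L * cH)) ** q * n_sub / n_0
    return 2 / (a + b), (a - b) / (a + b)

# Indices quoted in the caption of Fig. 4.17
for q in range(0, 11):
    t_p, r_p = quarter_wave_stack_p(q, n_H=2.32, n_L=1.38)
    print(q, t_p, r_p)
```

Because the second term in the denominator grows geometrically with q while the first shrinks, only a handful of layer pairs are needed before $r_p$ is very close to $-1$.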







Exercises 

Exercises for 4.1 Double-Interface Problem Solved Using Fresnel Coefficients 

P4.1 Use (4.4)-(4.7) to derive $r_p^{\rm tot}$ given in (4.12).

P4.2 Consider a 1 micron thick coating of dielectric material (n = 2) on 
a piece of glass (n = 1.5). Use a computer to plot the magnitude of 
the overall Fresnel coefficient (4.11) from air into the glass at normal 
incidence. Plot as a function of wavelength for wavelengths between 
200 nm and 800 nm (assume the index remains constant over this 
range). 

Exercises for 4.2 Two-Interface Transmittance at Sub Critical Angles 

P4.3 Verify that in the case that $\theta_1$ and $\theta_2$ are real, (4.14) simplifies to
(4.15).

P4.4 A light wave impinges at normal incidence on a thin glass plate with 
index n and thickness d. 

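(a) Show that the transmittance through the plate as a function of
wavelength is

$$ T^{\rm tot} = \frac{1}{1 + \frac{(n^2-1)^2}{4n^2}\sin^2\left(\frac{2\pi n d}{\lambda_{\rm vac}}\right)} $$

HINT: Find

$$ r^{1\to 2} = r^{1\to 0} = -r^{0\to 1} = \frac{n-1}{n+1} $$

and then use

$$ T^{0\to 1} = 1 - R^{0\to 1}, \qquad T^{1\to 2} = 1 - R^{1\to 2} $$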
(b) If $n = 1.5$, what are the maximum and minimum transmittances
through the plate?

(c) If the plate thickness is $d = 150\ \mu$m, what wavelengths transmit with
maximum efficiency? Express your answer as a formula involving an
integer $j$.

P4.5 Show that the maximum reflectance possible from the front coating in
Example 4.2 is 46%. Find the smallest possible $d_1$ that accomplishes
this for light with wavelength $\lambda_{\rm vac} = 633$ nm.

Exercises for 4.3 Beyond Critical Angle: Tunneling of Evanescent Waves

P4.6 Re-compute (4.22) in the case of s-polarized light. Write the result in
the same form as the last expression in (4.22).

L4.7 Consider s-polarized microwaves ($\lambda_{\rm vac} = 3$ cm) encountering an air
gap separating two paraffin wax prisms ($n = 1.5$). The 45° right-angle
prisms are arranged with the geometry shown in Fig. 4.4. The presence
of the second prism 'frustrates' the total internal reflection.




Figure 4.19 (Diagram labels: microwave source, paraffin lens, paraffin prisms,
paraffin lens, microwave detector.)



(a) Use a computer to plot the transmittance through the gap (i.e. the 
result of P4.6) as a function of separation d (normal to gap surface). 
Neglect reflections from other surfaces of the prisms. 

(b) Measure the transmittance of the microwaves through the prisms
as a function of spacing $d$ (normal to the surface) and superimpose the
results on the graph of part (a). Figure 4.18 shows a plot of typical data
taken with this setup. (video)

Figure 4.18 Theoretical vs. measured microwave transmission through wax prisms
as a function of separation (cm). Mismatch is presumably due to imperfections
in microwave collimation and/or extraneous reflections.



Exercises for 4.6 Distinguishing Nearby Wavelengths in a Fabry-Perot Instrument

P4.8 A Fabry-Perot interferometer has silver-coated plates each with
reflectance $R = 0.9$, transmittance $T = 0.05$, and absorbance $A = 0.05$.
The plate separation is $d = 0.5$ cm with interior index $n_1 = 1$. Suppose
that the wavelength being observed near normal incidence is 587 nm.

(a) What are the maximum and minimum transmittances through the
interferometer?

(b) What are the free spectral range $\Delta\lambda_{\rm FSR}$ and the fringe width
$\Delta\lambda_{\rm FWHM}$?

(c) What is the resolving power?

P4.9 Generate a plot like Fig. 4.11(a), showing the fringes you get in a
Fabry-Perot etalon when $\theta_1$ is varied. Let $T_{\max} = 1$, $F = 10$,
$\lambda = 500$ nm, $d = 1$ cm, and $n_1 = 1$.

(a) Plot $T$ vs. $\theta_1$ over the angular range used in Fig. 4.11(a).

(c) Suppose $d$ was slightly different, say 1.00001 cm. Make a plot of $T$
vs. $\theta_1$ for this situation.

P4.10 Consider the configuration depicted in Fig. 4.10, where the center of the
diverging light beam ($\lambda_{\rm vac} = 633$ nm) approaches the plates at normal
incidence. Suppose that the spacing of the plates (near $d = 0.5$ cm) is
just right to cause a bright fringe to occur at the center. Let $n_1 = 1$. Find
the angle for the $m$th circular bright fringe surrounding the central spot
(the 0th fringe corresponding to the center). HINT: $\cos\theta \cong 1 - \theta^2/2$.
The answer has the form $a\sqrt{m}$; find the value of $a$.

L4.11 Characterize a Fabry-Perot etalon in the laboratory using a HeNe laser
($\lambda_{\rm vac} = 633$ nm). Assume that the bandwidth $\Delta\lambda_{\rm HeNe}$ of the
HeNe laser is very narrow compared to the fringe width of the etalon
$\Delta\lambda_{\rm FWHM}$. Assume two identical reflective surfaces separated by
5.00 mm. Deduce the free spectral range $\Delta\lambda_{\rm FSR}$, the fringe width
$\Delta\lambda_{\rm FWHM}$, the resolving power, and the reflecting finesse (small $f$).
(video)




Figure 4.20 (Diagram labels: Fabry-Perot etalon, CCD camera.)

Figure 4.21 (Diagram labels: filter, Fabry-Perot etalon, CCD camera.)



L4.12 Use the same Fabry-Perot etalon to observe the Zeeman splitting of the
yellow line $\lambda = 587.4$ nm emitted by a krypton lamp when a magnetic
field is applied. As the line splits and moves through half of the free
spectral range, the peak of the decreasing wavelength and the peak of
the increasing wavelength meet on the screen. When this happens, by
how much has each wavelength shifted? (video)

Exercises for 4.7 Multilayer Coatings

P4.13 (a) Write (4.43) through (4.48) for s-polarized light.
(b) From these equations, derive (4.62)-(4.64).

P4.14 Show that (4.65) for a single layer (i.e. two interfaces), is equivalent to 
(4.11). WARNING: This is more work than it may appear at first. 



Exercises for 4.8 Repeated Multilayer Stacks 

P4.15 (a) What should be the thickness of the high and the low index layers in
a periodic high-reflector mirror? Let the light be p-polarized and strike
the mirror surface at 45°. Take the indices of the layers to be $n_H = 2.32$
and $n_L = 1.38$, deposited on a glass substrate with index $n = 1.5$. Let
the wavelength be $\lambda_{\rm vac} = 633$ nm.

(b) Find the reflectance R with 1, 2, 4, and 8 periods in the high-low 
stack. 

P4.16 Find the high-reflector matrix for s-polarized light that corresponds to 
(4.69). 

P4.17 Design an anti-reflection coating for use in air (assume the index of air
is 1):

(a) Show that for normal incidence and $\lambda/4$ films (thickness = 1/4 the
wavelength of light inside the material), the reflectance of a single layer
($n_1$) coating on a glass is

$$ R = \left( \frac{n_g - n_1^2}{n_g + n_1^2} \right)^2 $$

(b) Show that for a two coating setup (air-$n_1$-$n_2$-glass; $n_1$ and $n_2$ are
each a $\lambda/4$ film), that

$$ R = \left( \frac{n_2^2 - n_g n_1^2}{n_2^2 + n_g n_1^2} \right)^2 $$

(c) If $n_g = 1.5$, and you have a choice of these common coating materials:
ZnS ($n = 2.32$), CeF ($n = 1.63$) and MgF ($n = 1.38$), find the
combination that gives you the lowest $R$ for part (b). (Be sure to specify
which material is $n_1$ and which is $n_2$.) What $R$ does this combination
give?

P4.18 Consider a two-coating 'anti-reflection optic' (each coating set for
$\lambda/4$, as in problem P4.17) using $n_1 = 1.6$ and $n_2 = 2.1$ applied to a
glass substrate $n_g = 1.5$ at normal incidence. Suppose the coating
thicknesses are optimized for $\lambda = 550$ nm (in the middle of the visible
range) and ignore possible variations of the indices with $\lambda$. Use the
matrix techniques and a computer to plot $R(\lambda_{\rm air})$ for 400 to 700 nm
(visible range). Do this for a single bilayer (one layer of each coating),
two bilayers, four bilayers, and 25 bilayers.



Chapter 5 

Propagation in Anisotropic Media 



To this point, we have considered only isotropic media where the susceptibility
$\chi(\omega)$ (and hence the index of refraction) is the same for all propagation
directions and polarizations. In anisotropic materials, such as crystals, it is
possible for light to experience a different index of refraction depending on the
orientation (i.e. polarization) of the electric field E. This difference in the index
of refraction occurs when the direction and strength of the induced dipoles depend
on the lattice structure of the material in addition to the propagating field.¹ The
unique properties of anisotropic materials make them important elements in many
optical systems.

In section 5.1 we discuss how to connect E and P in anisotropic media using a
susceptibility tensor. In section 5.2 we apply Maxwell's equations to a plane wave
traveling in a crystal. The analysis leads to Fresnel's equation, which relates the
components of the k-vector to the components of the susceptibility tensor. In
section 5.3 we apply Fresnel's equation to a uniaxial crystal (e.g. quartz, sapphire)
where $\chi_x = \chi_y \neq \chi_z$. In the context of a uniaxial crystal, we show that the
Poynting vector and the k-vector are generally not parallel.

More than a century before Fresnel, Christiaan Huygens successfully described
birefringence in crystals using the idea of elliptical wavelets. His method gives 
the direction of the Poynting vector associated with the extraordinary ray in a 
crystal. It was Huygens who coined the term 'extraordinary' since one of the 
rays in a birefringent material appeared not to obey Snell's law. Actually, the 
k-vector always obeys Snell's law, but in a crystal, the k-vector points in a different 
direction than the Poynting vector, which delivers the energy seen by an observer. 
Huygens' approach is outlined in Appendix 5.D. 

¹Not all crystals are anisotropic. For instance, crystals with a cubic lattice structure (such as
NaCl) are highly symmetric and respond to electric fields the same in any direction.

5.1 Constitutive Relation in Crystals

Figure 5.1 A physical model of an electron bound in a crystal lattice with the
coordinate system specially chosen along the principal axes so that the
susceptibility tensor takes on a simple form.

Figure 5.2 The applied field E and the induced polarization P in general are not
parallel in a crystal lattice.

In an anisotropic crystal, asymmetries in the lattice can cause the medium
polarization P to respond in a different direction than the electric field E (i.e.
$\mathbf{P} \neq \epsilon_0 \chi \mathbf{E}$). However, at low intensities the response of
materials is still linear (or proportional) to the strength of the electric field. The
linear constitutive relation which connects P to E in a crystal can be expressed in
its most general form as



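$$ \begin{bmatrix} P_x \\ P_y \\ P_z \end{bmatrix} = \epsilon_0
\begin{bmatrix} \chi_{xx} & \chi_{xy} & \chi_{xz} \\ \chi_{yx} & \chi_{yy} & \chi_{yz} \\ \chi_{zx} & \chi_{zy} & \chi_{zz} \end{bmatrix}
\begin{bmatrix} E_x \\ E_y \\ E_z \end{bmatrix} \tag{5.1} $$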









The matrix in (5.1) is called the susceptibility tensor. To visualize the behavior 
of electrons in such a material, we imagine each electron bound as though by 
tiny springs with different strengths in different dimensions to represent the 
anisotropy (see Fig. 5.1). When an external electric field is applied, the electron 
experiences a force that moves it from its equilibrium position. The 'springs' 
(actually the electric force from ions bound in the crystal lattice) exert a restoring 
force, but the restoring force is not equal in all directions — the electron tends to 
move more along the dimension of the weaker spring. The displaced electron 
creates a microscopic dipole, but the asymmetric restoring force causes P to be in 
a direction different than E as depicted in Fig. 5.2. 

To understand the geometrical interpretation of the many coefficients $\chi_{ij}$,
assume, for example, that the electric field is directed along the x-axis (i.e.
$E_y = E_z = 0$) as depicted in Fig. 5.2. In this case, the three equations
encapsulated in (5.1) reduce to

$$ P_x = \epsilon_0 \chi_{xx} E_x $$
$$ P_y = \epsilon_0 \chi_{yx} E_x $$
$$ P_z = \epsilon_0 \chi_{zx} E_x $$

Notice that the coefficient $\chi_{xx}$ connects the strength of P in the x direction with
the strength of E in that same direction, just as in the isotropic case. The other two
coefficients ($\chi_{yx}$ and $\chi_{zx}$) describe the amount of polarization P produced in the
y and z directions by the electric field component in the x-dimension. Likewise,
the other coefficients with mixed subscripts in (5.1) describe the contribution to
P in one dimension made by an electric field component in another dimension.

As you might imagine, working with nine susceptibility coefficients can get
complicated. Fortunately, we can greatly reduce the complexity of the description
by a judicious choice of coordinate system. In Appendix 5.A we explain how
conservation of energy requires that the susceptibility tensor (5.1) for typical
non-absorbing crystals be real and symmetric (i.e. $\chi_{ij} = \chi_{ji}$).²

²By 'typical' we mean that the crystal does not exhibit optical activity. Optically active crystals
have a complex susceptibility tensor, even when no absorption takes place. Conservation of energy
in this more general case requires that the susceptibility tensor be Hermitian
($\chi_{ij} = \chi_{ji}^*$).

Appendix 5.B shows that, given a real symmetric tensor, it is always possible
to choose a coordinate system for which off-diagonal elements vanish. This is
true even if the lattice planes in the crystal are not mutually orthogonal (e.g.
rhombus, hexagonal, etc.). We will imagine that this rotation of coordinates
has been accomplished. In other words, we can let the crystal itself dictate the
orientation of the coordinate system, aligned to the principal axes of the crystal
for which the off-diagonal elements of (5.1) are zero.
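
The rotation to principal axes amounts to diagonalizing a real symmetric matrix. The sketch below illustrates this with NumPy; the tensor values in `chi_lab` are hypothetical numbers chosen only for demonstration and do not describe any particular crystal.

```python
import numpy as np

# Hypothetical real, symmetric susceptibility tensor in an arbitrary lab frame
chi_lab = np.array([[1.30, 0.10, 0.05],
                    [0.10, 1.50, 0.20],
                    [0.05, 0.20, 1.80]])

# eigh is meant for symmetric (Hermitian) matrices: the eigenvalues are the
# principal susceptibilities and the columns of R are the principal axes
chi_principal, R = np.linalg.eigh(chi_lab)

# Expressing the tensor in the principal-axis frame removes the off-diagonal terms
chi_rotated = R.T @ chi_lab @ R

print(np.round(chi_rotated, 12))
print(chi_principal)
```

Since the columns of `R` are orthonormal, `R.T @ chi_lab @ R` is the same tensor written in rotated coordinates, and only the three diagonal entries survive.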

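With the coordinate system aligned to the principal axes, the constitutive
relation for a non-absorbing crystal simplifies to

$$ \begin{bmatrix} P_x \\ P_y \\ P_z \end{bmatrix} = \epsilon_0
\begin{bmatrix} \chi_x & 0 & 0 \\ 0 & \chi_y & 0 \\ 0 & 0 & \chi_z \end{bmatrix}
\begin{bmatrix} E_x \\ E_y \\ E_z \end{bmatrix} \tag{5.2} $$

or without the matrix notation (since it no longer offers much convenience)

$$ \mathbf{P} = \hat{\mathbf{x}}\epsilon_0\chi_x E_x + \hat{\mathbf{y}}\epsilon_0\chi_y E_y + \hat{\mathbf{z}}\epsilon_0\chi_z E_z \tag{5.3} $$

By assumption, $\chi_x$, $\chi_y$, and $\chi_z$ are all real. (We have dropped the double
subscript; $\chi_x$ stands for $\chi_{xx}$, etc.)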
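
Even in the principal-axis frame, P and E are parallel only when E lies along one of the axes. The short sketch below evaluates (5.3) for two field directions; the susceptibility values and the helper names `polarization` and `angle_between` are hypothetical, introduced only for illustration.

```python
import numpy as np

eps0 = 8.854e-12                  # vacuum permittivity (F/m)
chi = np.array([1.3, 1.5, 1.8])   # hypothetical chi_x, chi_y, chi_z

def polarization(E):
    """P from (5.3), with E expressed in the principal-axis frame."""
    return eps0 * chi * np.asarray(E, dtype=float)

def angle_between(a, b):
    """Angle in degrees between two vectors."""
    cosang = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

for E in ([1.0, 0.0, 0.0], [1.0, 0.0, 1.0]):
    P = polarization(E)
    print(E, angle_between(E, P))
```

The field along x gives zero angle, while the field midway between x and z is pulled toward the axis with the larger susceptibility.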



5.2 Plane Wave Propagation in Crystals 

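We consider a plane wave with frequency $\omega$ propagating in a crystal. In a manner
similar to our previous analysis of plane waves propagating in isotropic materials,
we write as trial solutions

$$ \mathbf{E} = \mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} $$
$$ \mathbf{B} = \mathbf{B}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} $$
$$ \mathbf{P} = \mathbf{P}_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} \tag{5.4} $$

where restrictions on $\mathbf{E}_0$, $\mathbf{B}_0$, $\mathbf{P}_0$, and $\mathbf{k}$ are yet to be
determined. As usual, the phase of each wave is included in the amplitudes
$\mathbf{E}_0$, $\mathbf{B}_0$, and $\mathbf{P}_0$, whereas $\mathbf{k}$ is real
in accordance with our assumption of no absorption.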

We can make a quick observation about the behavior of these fields by applying
Maxwell's equations directly. Gauss's law for electric fields requires

$$ \nabla\cdot(\epsilon_0\mathbf{E} + \mathbf{P}) = i\mathbf{k}\cdot(\epsilon_0\mathbf{E} + \mathbf{P}) = 0 \tag{5.5} $$

and Gauss's law for magnetism gives

$$ \nabla\cdot\mathbf{B} = i\mathbf{k}\cdot\mathbf{B} = 0 \tag{5.6} $$

We immediately notice the following peculiarity: From its definition, the Poynting
vector $\mathbf{S} \equiv \mathbf{E}\times\mathbf{B}/\mu_0$ is perpendicular to both E and B, and by (5.6)
the k-vector is perpendicular to B. However, by (5.5) the k-vector is not necessarily
perpendicular to E, since in general $\mathbf{k}\cdot\mathbf{E} \neq 0$ if P points in a direction
other than E. Therefore, k and S are not necessarily parallel in a crystal. In other
words, the flow of energy and the direction of the phase-front propagation can be
different in anisotropic media.

Our main goal here is to relate the k-vector to the susceptibility parameters $\chi_x$,
$\chi_y$, and $\chi_z$. To do this, we plug our trial plane-wave fields into the wave equation
(1.41). Under the assumption $\mathbf{J}_{\rm free} = 0$, we have

$$ \nabla^2\mathbf{E} - \mu_0\epsilon_0\frac{\partial^2\mathbf{E}}{\partial t^2} = \mu_0\frac{\partial^2\mathbf{P}}{\partial t^2} + \nabla(\nabla\cdot\mathbf{E}) \tag{5.7} $$



Derivation of the dispersion relation in crystals

We begin by substituting the trial solutions (5.4) into the wave equation (5.7). After
carrying out the derivatives we find

$$ k^2\mathbf{E} - \omega^2\mu_0(\epsilon_0\mathbf{E} + \mathbf{P}) = \mathbf{k}(\mathbf{k}\cdot\mathbf{E}) \tag{5.8} $$

Inserting the constitutive relation (5.3) for crystals into (5.8) yields

$$ k^2\mathbf{E} - \omega^2\mu_0\epsilon_0\left[(1+\chi_x)E_x\hat{\mathbf{x}} + (1+\chi_y)E_y\hat{\mathbf{y}} + (1+\chi_z)E_z\hat{\mathbf{z}}\right] = \mathbf{k}(\mathbf{k}\cdot\mathbf{E}) \tag{5.9} $$

This relationship is unwieldy because of the mix of electric field components that
appear in the expression. This was not a problem when we investigated isotropic
materials for which the k-vector is perpendicular to E, making the right-hand side
of the equations zero. However, there is a trick for dealing with this.

Relation (5.9) actually contains three equations, one for each dimension. Explicitly,
these equations are

$$ \left[k^2 - \frac{\omega^2}{c^2}(1+\chi_x)\right]E_x = k_x(\mathbf{k}\cdot\mathbf{E}) \tag{5.10} $$

$$ \left[k^2 - \frac{\omega^2}{c^2}(1+\chi_y)\right]E_y = k_y(\mathbf{k}\cdot\mathbf{E}) \tag{5.11} $$

and

$$ \left[k^2 - \frac{\omega^2}{c^2}(1+\chi_z)\right]E_z = k_z(\mathbf{k}\cdot\mathbf{E}) \tag{5.12} $$

We have replaced the constants $\mu_0\epsilon_0$ with $1/c^2$ in accordance with (1.43). We
multiply (5.10)-(5.12) respectively by $k_x$, $k_y$, and $k_z$ and also move the factor in
square brackets in each equation to the denominator on the right-hand side. Then
by adding the three equations together we get

$$ \frac{k_x^2(\mathbf{k}\cdot\mathbf{E})}{k^2 - \frac{\omega^2}{c^2}(1+\chi_x)} + \frac{k_y^2(\mathbf{k}\cdot\mathbf{E})}{k^2 - \frac{\omega^2}{c^2}(1+\chi_y)} + \frac{k_z^2(\mathbf{k}\cdot\mathbf{E})}{k^2 - \frac{\omega^2}{c^2}(1+\chi_z)} = k_x E_x + k_y E_y + k_z E_z = (\mathbf{k}\cdot\mathbf{E}) \tag{5.13} $$

Now $\mathbf{k}\cdot\mathbf{E}$ appears in every term and can be divided away. This gives the
dispersion relation (unencumbered by field components):

$$ \frac{k_x^2}{k^2c^2/\omega^2 - (1+\chi_x)} + \frac{k_y^2}{k^2c^2/\omega^2 - (1+\chi_y)} + \frac{k_z^2}{k^2c^2/\omega^2 - (1+\chi_z)} = \frac{\omega^2}{c^2} \tag{5.14} $$

As a final touch, we have multiplied the equation through by $\omega^2/c^2$.

The dispersion relation (5.14) allows us to find a suitable k, given values for $\omega$,
$\chi_x$, $\chi_y$, and $\chi_z$. Actually, it only restricts the magnitude of k; we must still decide
on a direction for the wave to travel (i.e. we must choose the ratios between $k_x$, $k_y$,
and $k_z$). To remind ourselves of this fact, we introduce a unit vector that points in
the direction of k

$$ \mathbf{k} = k_x\hat{\mathbf{x}} + k_y\hat{\mathbf{y}} + k_z\hat{\mathbf{z}} = k\left(u_x\hat{\mathbf{x}} + u_y\hat{\mathbf{y}} + u_z\hat{\mathbf{z}}\right) = k\hat{\mathbf{u}} \tag{5.15} $$

With this unit vector inserted, the dispersion relation (5.14) for plane waves in a
crystal becomes

$$ \frac{u_x^2}{k^2c^2/\omega^2 - (1+\chi_x)} + \frac{u_y^2}{k^2c^2/\omega^2 - (1+\chi_y)} + \frac{u_z^2}{k^2c^2/\omega^2 - (1+\chi_z)} = \frac{\omega^2}{k^2c^2} \tag{5.16} $$

We may define refractive index as the ratio of the speed of light in vacuum
$c$ to the speed of phase propagation in a material $\omega/k$ (see P1.9). The relation
introduced for isotropic media (i.e. (2.19) for real index) remains appropriate.
That is

$$ n = \frac{kc}{\omega} \tag{5.17} $$

This familiar relationship between $k$ and $\omega$, in the case of a crystal, depends on
the direction of propagation in accordance with (5.16).

Inspired by (2.30), we will find it helpful to introduce several refractive-index
parameters:

$$ n_x \equiv \sqrt{1+\chi_x} $$
$$ n_y \equiv \sqrt{1+\chi_y} \tag{5.18} $$
$$ n_z \equiv \sqrt{1+\chi_z} $$

With these definitions (5.17)-(5.18), the dispersion relation (5.16) becomes

$$ \frac{u_x^2}{n^2 - n_x^2} + \frac{u_y^2}{n^2 - n_y^2} + \frac{u_z^2}{n^2 - n_z^2} = \frac{1}{n^2} \tag{5.19} $$

This is called Fresnel's equation (not to be confused with the Fresnel coefficients
studied in chapter 3). The relationship contains the yet unknown index $n$ that
varies with the direction of the k-vector (i.e. the direction of the unit vector u).

After multiplying through by all of the denominators (and after a fortuitous
cancelation owing to $u_x^2 + u_y^2 + u_z^2 = 1$), Fresnel's equation (5.19) can be rewritten
as a quadratic in $n^2$. The two solutions are

$$ n^2 = \frac{B \pm \sqrt{B^2 - 4AC}}{2A} \tag{5.20} $$

where

$$ A \equiv u_x^2 n_x^2 + u_y^2 n_y^2 + u_z^2 n_z^2 \tag{5.21} $$
$$ B \equiv u_x^2 n_x^2\left(n_y^2 + n_z^2\right) + u_y^2 n_y^2\left(n_x^2 + n_z^2\right) + u_z^2 n_z^2\left(n_x^2 + n_y^2\right) \tag{5.22} $$
$$ C \equiv n_x^2 n_y^2 n_z^2 \tag{5.23} $$

The upper and lower signs (+ and -) in (5.20) give two positive solutions for $n^2$.
The positive square root of these solutions yields two physical values for $n$. It turns
out that each of the two values for $n$ is associated with a polarization direction
of the electric field, given a propagation direction k. A broader analysis carried
out in appendix 5.C renders the orientation of the electric fields, whereas here we
only show how to find the two values of $n$. We refer to the two indices as the slow
and fast index, since the waves associated with each propagate at speed $v = c/n$.

In the special cases of propagation along one of the principal axes of the
crystal, the index $n$ takes on two of the values $n_x$, $n_y$, or $n_z$, depending on which
are orthogonal to the direction of propagation.
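
The two solutions of (5.20) are straightforward to evaluate numerically. The following sketch assumes the direction vector is given in the principal-axis frame; the function name `fresnel_indices` is hypothetical, and the sample indices are those quoted later for potassium niobate.

```python
import numpy as np

def fresnel_indices(u, n_x, n_y, n_z):
    """Return (n_fast, n_slow) from (5.20)-(5.23) for propagation direction u."""
    u = np.asarray(u, dtype=float)
    ux2, uy2, uz2 = (u / np.linalg.norm(u)) ** 2   # normalize, then square
    A = ux2 * n_x**2 + uy2 * n_y**2 + uz2 * n_z**2
    B = (ux2 * n_x**2 * (n_y**2 + n_z**2)
         + uy2 * n_y**2 * (n_x**2 + n_z**2)
         + uz2 * n_z**2 * (n_x**2 + n_y**2))
    C = n_x**2 * n_y**2 * n_z**2
    root = np.sqrt(max(B**2 - 4 * A * C, 0.0))     # guard against round-off
    return np.sqrt((B - root) / (2 * A)), np.sqrt((B + root) / (2 * A))

# Propagation along each principal axis returns the two orthogonal indices
n_x, n_y, n_z = 2.22, 2.35, 2.41
for u in ([1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]):
    print(u, fresnel_indices(u, n_x, n_y, n_z))
```

The smaller root is returned first because it corresponds to the faster phase velocity $c/n$.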

Example 5.1

Calculate the two possible values for the index of refraction when k is in the z
direction (in the crystal principal frame).

Solution: With $u_z = 1$ and $u_x = u_y = 0$ we have

$$ A = n_z^2; \qquad B = n_z^2\left(n_x^2 + n_y^2\right); \qquad C = n_x^2 n_y^2 n_z^2 $$

The square-root term is then

$$ \sqrt{B^2 - 4AC} = \sqrt{n_z^4\left(n_x^4 + 2n_x^2 n_y^2 + n_y^4\right) - 4 n_x^2 n_y^2 n_z^4} $$
$$ = \sqrt{n_z^4\left(n_x^2 - n_y^2\right)^2} $$
$$ = n_z^2\left(n_x^2 - n_y^2\right) $$

Inserting this expression into (5.20), we find the two values for the index:

$$ n = n_x,\; n_y $$

The index $n_x$ is experienced by light whose electric field points in the x-dimension,
and the index $n_y$ is experienced by light whose electric field points in the
y-dimension (see appendix 5.C).

Before moving on, let us briefly summarize what has been accomplished so
far. Given values for $\chi_x$, $\chi_y$, and $\chi_z$ associated with light in a crystal at a given
frequency, you can define the indices $n_x$, $n_y$, and $n_z$, according to (5.18). Next, a
direction for the k-vector is chosen (i.e. $u_x$, $u_y$, and $u_z$). This direction generally
has two values for the index of refraction associated with it, found using Fresnel's
equation (5.20). Each index is associated with a specific polarization direction
for the electric field as outlined in appendix 5.C. Every propagation direction u
has its own natural set of polarization components for the electric field. The two
polarization components travel at different speeds, even though the frequency is
the same. This is known as birefringence.

Figure 5.3 Spherical coordinates.

5.3 Biaxial and Uniaxial Crystals

All anisotropic crystals have certain special propagation directions where the two
values for $n$ from Fresnel's equation are equal. These directions are referred to as
the optic axes of the crystal. The optic axes do not necessarily coincide with the
principal axes x, y, and z. When propagation is along an optic axis, all polarization
components experience the same index of refraction. If the values of $n_x$, $n_y$, and
$n_z$ are all unique, a crystal will have two optic axes, and hence is referred to as a
biaxial crystal.

It is often convenient to use spherical coordinates to represent the components
of u (see Fig. 5.3):

$$ u_x = \sin\theta\cos\phi $$
$$ u_y = \sin\theta\sin\phi \tag{5.24} $$
$$ u_z = \cos\theta $$

Here $\theta$ is the polar angle measured from the z-axis of the crystal and $\phi$ is the
azimuthal angle measured from the x-axis of the crystal. These equations emphasize
the fact that there are only two degrees of freedom when specifying
propagation direction ($\theta$ and $\phi$). It is important to remember that these angles
must be specified in the frame of the crystal's principal axes, which are often not
aligned with the faces of a cut crystal in an optical setup.

By convention, we order the crystal axes for biaxial crystals so that
$n_x < n_y < n_z$. Under this convention, the two optic axes occur in the x-z plane
($\phi = 0$) at two values of the polar angle $\theta$, measured from the z-axis (see P5.3):

$$ \cos\theta = \pm\frac{n_x}{n_y}\sqrt{\frac{n_z^2 - n_y^2}{n_z^2 - n_x^2}} \qquad \text{(Optic axes directions, biaxial crystal)} \tag{5.25} $$



While finding the direction of the optic axes in a biaxial crystal is not too bad, an
expression for the two indices of refraction is messy. The smaller value is commonly
referred to as the 'fast' index and the larger value the 'slow' index. Figure 5.4
shows the two refractive indices (i.e. the solutions to Fresnel's equation (5.20)) for
a biaxial crystal plotted with color shading on the surface of a sphere. Each point
on the sphere represents a different $\theta$ and $\phi$. The two optic axes are apparent
in the plot of the difference between $n_{\rm slow}$ and $n_{\rm fast}$. When propagating in these
directions, either polarization experiences the same index. For the remainder of
this chapter, we will focus on the simpler case of uniaxial crystals.
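
As a numerical cross-check of (5.25), the sketch below computes an optic-axis angle for the potassium niobate indices quoted in the caption of Fig. 5.4 and confirms that the fast and slow solutions of (5.20) coincide in that direction. The function names are hypothetical, and the code is only an illustration of the formulas, not the method used to produce the figure.

```python
import numpy as np

def fresnel_indices(u, n_x, n_y, n_z):
    """Fast and slow indices from (5.20)-(5.23) for a unit vector u."""
    ux2, uy2, uz2 = np.asarray(u, dtype=float) ** 2
    A = ux2 * n_x**2 + uy2 * n_y**2 + uz2 * n_z**2
    B = (ux2 * n_x**2 * (n_y**2 + n_z**2)
         + uy2 * n_y**2 * (n_x**2 + n_z**2)
         + uz2 * n_z**2 * (n_x**2 + n_y**2))
    C = (n_x * n_y * n_z) ** 2
    root = np.sqrt(max(B**2 - 4 * A * C, 0.0))   # clip tiny negative round-off
    return np.sqrt((B - root) / (2 * A)), np.sqrt((B + root) / (2 * A))

def optic_axis_angle(n_x, n_y, n_z):
    """Polar angle (radians) of an optic axis from (5.25), assuming n_x < n_y < n_z."""
    return np.arccos((n_x / n_y) * np.sqrt((n_z**2 - n_y**2) / (n_z**2 - n_x**2)))

n_x, n_y, n_z = 2.22, 2.35, 2.41            # values from the Fig. 5.4 caption
theta = optic_axis_angle(n_x, n_y, n_z)
u = (np.sin(theta), 0.0, np.cos(theta))      # phi = 0, i.e. in the x-z plane
n_fast, n_slow = fresnel_indices(u, n_x, n_y, n_z)
print(np.degrees(theta), n_fast, n_slow)
```

Along this direction both indices collapse to $n_y$, which is the defining property of an optic axis.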

In uniaxial crystals two of the coefficients $\chi_x$, $\chi_y$, and $\chi_z$ are the same. In
this case, there is only one optic axis for the crystal (hence the name uniaxial).
By convention, in uniaxial crystals we label the dimension that has the unique
susceptibility as the z-axis (i.e. $\chi_x = \chi_y \neq \chi_z$). This makes the z-axis the optic axis.
The unique index of refraction is called the extraordinary index

$$ n_z = n_e \tag{5.26} $$

and the other index is called the ordinary index

$$ n_x = n_y = n_o \tag{5.27} $$

These names were coined by Huygens, one of the early scientists to study light
in crystals (see appendix 5.D). A uniaxial crystal with $n_e > n_o$ is referred to as a
positive crystal, and one with $n_e < n_o$ is referred to as a negative crystal.

Figure 5.4 The fast and slow refractive indices (and their difference) as a function
of direction for potassium niobate (KNbO$_3$) at $\lambda = 500$ nm ($n_x = 2.22$,
$n_y = 2.35$, and $n_z = 2.41$).

Figure 5.5 The extraordinary and ordinary refractive indices (and their difference)
as a function of direction for beta barium borate (BBO) at $\lambda = 500$ nm
($n_o = 1.68$ and $n_e = 1.56$).

To calculate the index of refraction for a wave propagating in a uniaxial crystal,
we use definitions (5.26) and (5.27) along with the spherical representation of u
(5.24) in Fresnel's equation (5.20) to find the following two values for $n$ (see P5.4):

$$ n = n_o \qquad \text{(uniaxial crystal)} \tag{5.28} $$

and

$$ n = n_e(\theta) \equiv \frac{n_o n_e}{\sqrt{n_o^2\sin^2\theta + n_e^2\cos^2\theta}} \qquad \text{(uniaxial crystal)} \tag{5.29} $$

The index $n_e(\theta)$ in (5.29) is also commonly referred to as the extraordinary index
along with the constant $n_e = n_z$. While this has the potential for some confusion,
the practice is so common that we will perpetuate it here. We will write $n_e(\theta)$
when the angle dependent quantity specified by (5.29) is required, and write $n_e$
in formulas where the constant (5.26) is called for (as in the right hand side of
(5.29)). Notice that $n_e(\theta)$ depends only on $\theta$ (the polar angle measured from the
optic axis z) and not $\phi$ (the azimuthal angle). Figure 5.5 shows the two refractive
indices (5.28) and (5.29) as a function of $\theta$ and $\phi$. Since $n_e(\theta)$ has no $\phi$ dependence
and $n_o$ is constant, the variation is much simpler than for the biaxial case.

As outlined in appendix 5.C, the index $n_o$ corresponds to an electric field
component that points perpendicular to the plane containing u and z (e.g. if
u is in the x-z plane, $n_o$ is associated with light polarized in the y-dimension).
On the other hand, the index $n_e(\theta)$ corresponds to field polarization that lies
within the plane containing u and z. In this case, the polarization component
is directed partially along the optic axis (i.e. it has a z-component). That is why
(5.29) gives for the refractive index a mixture of $n_o$ and $n_e$. If $\theta = 0$, then the
k-vector is directed exactly along the optic axis, and $n_e(\theta)$ reduces to $n_o$ so that
both polarization components experience the same index $n_o$.
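
The limiting behavior of (5.29) can be seen by evaluating it at a few angles. This is a minimal sketch using the BBO indices quoted in the caption of Fig. 5.5; the function name `n_extraordinary` is hypothetical.

```python
import numpy as np

def n_extraordinary(theta, n_o, n_e):
    """Angle-dependent extraordinary index n_e(theta) from (5.29); theta in radians."""
    return n_o * n_e / np.sqrt(n_o**2 * np.sin(theta)**2 + n_e**2 * np.cos(theta)**2)

n_o, n_e = 1.68, 1.56   # values from the Fig. 5.5 caption
for deg in (0, 30, 60, 90):
    print(deg, n_extraordinary(np.radians(deg), n_o, n_e))
```

At $\theta = 0$ the result equals $n_o$, and at $\theta = 90°$ it equals $n_e$, with a smooth variation in between.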



5.4 Refraction at a Uniaxial Crystal Surface 

Next we consider refraction as light enters a uniaxial crystal. Snell's law (3.7)
describes the connection between the k-vectors incident upon and transmitted
through the surface. We must consider separately the portion of the light that
experiences the ordinary index from the portion that experiences the extraordinary
index. Because of the different indices, the ordinary and extraordinary polarized
light refract into the crystal at two different angles; they travel at two different
velocities in the crystal; and they have two different wavelengths in the crystal.

If we assume that the index outside of the crystal is one, Snell's law for the
ordinary polarization is

$$ \sin\theta_i = n_o\sin\theta_t \qquad \text{(ordinary polarized light)} \tag{5.30} $$

where $n_o$ is the ordinary index inside the crystal. The extraordinary polarized
light also obeys Snell's law, but now the index of refraction in the crystal depends
on the direction of propagation inside the crystal relative to the optic axis. Snell's
law for the extraordinary polarization is

$$ \sin\theta_i = n_e(\theta')\sin\theta_t \qquad \text{(extraordinary polarized light)} \tag{5.31} $$

where $\theta'$ is the angle between the optic axis inside the crystal and the direction of
propagation in the crystal (given by $\theta_t$ in the plane of incidence). When the optic
axis is at an arbitrary angle with respect to the surface, the relationship between
$\theta'$ and $\theta_t$ is cumbersome. We will examine Snell's law only for the specific case
when the optic axis is perpendicular to the crystal surface, for which $\theta_t = \theta'$.

Example 5.2

Derive Snell's law for a uniaxial crystal with optic axis perpendicular to the surface.

Solution: Refer to Fig. 5.6. With the optic axis perpendicular to the surface, if the
light hits the crystal surface straight on, the index of refraction is $n_o$, regardless of
the orientation of polarization since $\theta' = 0$. When the light strikes the surface at an
angle, s-polarized light continues to experience the index $n_o$, while p-polarized
light experiences the extraordinary index $n_e(\theta)$.³

When we insert (5.29) into Snell's law (5.31) with $\theta' = \theta_t$, the expression can be
inverted to find the transmitted angle $\theta_t$ in terms of $\theta_i$ (see P5.5):

$$ \tan\theta_t = \frac{n_e\sin\theta_i}{n_o\sqrt{n_e^2 - \sin^2\theta_i}} \qquad \text{(extraordinary polarized, optic axis $\perp$ surface)} \tag{5.32} $$

As strange as this formula may appear, it is Snell's law, but with an angularly
dependent index.

³The correspondence between s and p and ordinary and extraordinary polarization components
is specific to the orientation of the optic axis in this example. For arbitrary orientations of the
optic axis with respect to the surface, the ordinary and extraordinary components will generally be
mixtures of s and p polarized light.

Figure 5.6 Propagation of light in a uniaxial crystal with its optic axis
perpendicular to the surface. (Diagram labels: y-axis, z-axis, x-axis directed into
the page; index 1 on the incident side.)

5.5 Poynting Vector in a Uniaxial Crystal

When an object is observed through a crystal (acting as a window), the energy
associated with ordinary and extraordinary polarized light follow different paths,
giving rise to two different images. This phenomenon is one of the more commonly
observed manifestations of birefringence. Since the Poynting vector dictates
the direction of energy flow, it is the direction of S that determines the
separation of the double image seen when looking through a birefringent crystal.

Snell's law dictates the connection between the directions of the incident 
and transmitted k- vectors. The Poynting vector S for purely ordinary polarized 
light points in the same direction as the k- vector, so the direction of energy flow 
for ordinary polarized light also obeys Snell's law. However, for extraordinary 
polarized light, the Poynting vector S is not parallel to k (recall the discussion in
connection with (5.5) and (5.6)). Thus, the energy flow associated with extraordinary
polarized light does not obey Snell's law. When Christiaan Huygens saw this
in the 1600s, he exclaimed "how extraordinary!" Huygens' method for describing
the phenomenon is outlined in Appendix 5.D.

To analyze this situation, it is necessary to derive an expression for extraordinary
polarized light similar to Snell's law, but which applies to S rather than to k.
This describes the direction that the energy associated with extraordinary rays
takes upon entering the crystal. To calculate the direction that the extraordinary
polarized S takes upon entering a crystal, we first calculate the direction of k
inside the crystal using Snell's law (5.31). Then we use the expression (5.62) for E
along with $\mathbf{B} = (\mathbf{k}\times\mathbf{E})/\omega$ to evaluate $\mathbf{S} = \mathbf{E}\times\mathbf{B}/\mu_0$. In general, this process is best
done numerically, since Snell's law (5.31) for extraordinary polarized light usually
does not have simple analytic solutions.
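The first numerical step described above, solving Snell's law (5.31) for the direction of k, might be sketched as follows. This is only an illustration under a stated assumption: the optic axis is taken to lie in the plane of incidence, tilted by a hypothetical angle `alpha` from the surface normal, so that $\theta' = \theta_t - \alpha$. The function names and the bisection approach are choices made for this sketch, not a procedure prescribed in the text.

```python
import math

def n_extraordinary(theta, n_o, n_e):
    """Angle-dependent extraordinary index, Eq. (5.29); theta is measured from the optic axis."""
    s, c = math.sin(theta), math.cos(theta)
    return n_o * n_e / math.sqrt(n_o**2 * s**2 + n_e**2 * c**2)

def solve_theta_t(theta_i, alpha, n_o, n_e, iterations=80):
    """Solve sin(theta_i) = n_e(theta_t - alpha) sin(theta_t) for theta_t by bisection.

    Assumes n_i = 1 and an optic axis in the plane of incidence at angle alpha
    from the surface normal (an assumption of this sketch).
    """
    def mismatch(theta_t):
        return n_extraordinary(theta_t - alpha, n_o, n_e) * math.sin(theta_t) - math.sin(theta_i)

    lo, hi = 0.0, math.pi / 2      # mismatch(lo) <= 0 and mismatch(hi) > 0 since the index exceeds 1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

n_o, n_e = 1.54424, 1.55335        # quartz indices quoted in P5.6
theta_i = math.radians(50.0)       # illustrative incident angle (assumption)

theta_t_axis_normal = solve_theta_t(theta_i, 0.0, n_o, n_e)
theta_t_axis_tilted = solve_theta_t(theta_i, math.radians(30.0), n_o, n_e)
print(math.degrees(theta_t_axis_normal), math.degrees(theta_t_axis_tilted))
```

With `alpha = 0` the result should reproduce the closed-form expression (5.32); for a tilted axis no such simple formula is available, which is why a root finder is convenient.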

Example 5.3 

Find a relationship between the direction of the Poynting vector in a uniaxial crystal
and the angle of incidence in the special case where the optic axis is perpendicular
to the surface.

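Solution: To find the direction of energy flow, we must calculate $\mathbf{S} = \mathbf{E}\times\mathbf{B}/\mu_0$. We
will need to know E associated with $n_e(\theta)$. We can obtain E from the procedures
outlined in Appendix 5.C. Equivalently, we can obtain it from the constitutive
relation (5.3) with the definitions (5.18):

$$\epsilon_0\mathbf{E} + \mathbf{P} = \epsilon_0\left[(1+\chi_x)E_x\hat{\mathbf{x}} + (1+\chi_y)E_y\hat{\mathbf{y}} + (1+\chi_z)E_z\hat{\mathbf{z}}\right] = \epsilon_0\left[n_o^2E_x\hat{\mathbf{x}} + n_o^2E_y\hat{\mathbf{y}} + n_e^2E_z\hat{\mathbf{z}}\right] \tag{5.33}$$

Let the k-vector lie in the y-z plane. We may write it as $\mathbf{k} = k(\hat{\mathbf{y}}\sin\theta_t + \hat{\mathbf{z}}\cos\theta_t)$.
Then the ordinary component of the field points in the x-direction, while the
extraordinary component lies in the y-z plane.

Equation (5.33) requires

$$\mathbf{k}\cdot(\epsilon_0\mathbf{E} + \mathbf{P}) = k(\hat{\mathbf{y}}\sin\theta_t + \hat{\mathbf{z}}\cos\theta_t)\cdot\epsilon_0\left[n_o^2E_x\hat{\mathbf{x}} + n_o^2E_y\hat{\mathbf{y}} + n_e^2E_z\hat{\mathbf{z}}\right] = \epsilon_0 k\left[n_o^2E_y\sin\theta_t + n_e^2E_z\cos\theta_t\right] = 0 \tag{5.34}$$

Therefore, the y and z components of the extraordinary field are related through

$$E_z = -\frac{n_o^2E_y}{n_e^2}\tan\theta_t \tag{5.35}$$

We may write the extraordinary polarized electric field as

$$\mathbf{E} = E_y\left(\hat{\mathbf{y}} - \hat{\mathbf{z}}\frac{n_o^2}{n_e^2}\tan\theta_t\right)e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)} \qquad \text{(extraordinary polarized)} \tag{5.36}$$

The associated magnetic field (see (2.56)) is

$$\mathbf{B} = \frac{\mathbf{k}\times\mathbf{E}}{\omega} = \frac{k(\hat{\mathbf{y}}\sin\theta_t + \hat{\mathbf{z}}\cos\theta_t)\times E_y\left(\hat{\mathbf{y}} - \hat{\mathbf{z}}\frac{n_o^2}{n_e^2}\tan\theta_t\right)}{\omega}e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)} = -\hat{\mathbf{x}}\frac{kE_y}{\omega}\left(\frac{n_o^2}{n_e^2}\sin\theta_t\tan\theta_t + \cos\theta_t\right)e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)} \qquad \text{(extraordinary polarized)} \tag{5.37}$$

The time-averaged Poynting vector then becomes

$$\langle\mathbf{S}\rangle_t = \left\langle \mathrm{Re}\{\mathbf{E}\}\times\mathrm{Re}\left\{\frac{\mathbf{B}}{\mu_0}\right\}\right\rangle_t = -\frac{k|E_y|^2}{\mu_0\omega}\left[\left(\hat{\mathbf{y}} - \hat{\mathbf{z}}\frac{n_o^2}{n_e^2}\tan\theta_t\right)\times\hat{\mathbf{x}}\right]\left(\frac{n_o^2}{n_e^2}\sin\theta_t\tan\theta_t + \cos\theta_t\right)\left\langle\cos^2\left(\mathbf{k}\cdot\mathbf{r}-\omega t+\phi_{E_y}\right)\right\rangle_t$$

$$= \frac{k|E_y|^2}{2\mu_0\omega}\left(\frac{n_o^2}{n_e^2}\sin\theta_t\tan\theta_t + \cos\theta_t\right)\left(\hat{\mathbf{z}} + \hat{\mathbf{y}}\frac{n_o^2}{n_e^2}\tan\theta_t\right) \qquad \text{(extraordinary polarized)} \tag{5.38}$$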



Let us label the direction of the Poynting vector with the angle $\theta_S$. By definition,
the tangent of this angle is the ratio of the two vector components of S:

$$\tan\theta_S \equiv \frac{S_y}{S_z} = \frac{n_o^2}{n_e^2}\tan\theta_t \qquad \text{(extraordinary polarized)} \tag{5.39}$$

While the k-vector is characterized by the angle $\theta_t$, the Poynting vector is characterized
by the angle $\theta_S$. Combining (5.32) and (5.39), we can connect $\theta_S$ to the
incident angle $\theta_i$:

$$\tan\theta_S = \frac{n_o\sin\theta_i}{n_e\sqrt{n_e^2 - \sin^2\theta_i}} \qquad \text{(extraordinary polarized)} \tag{5.40}$$
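The relationship between the k-vector angle and the Poynting-vector angle can be checked numerically. The sketch below evaluates $\theta_S$ directly from (5.40) and, independently, by applying (5.39) to the $\theta_t$ given by (5.32). The function names and the incident angle are illustrative assumptions; the indices are the quartz values quoted in P5.6.

```python
import math

def theta_t_extraordinary(theta_i, n_o, n_e):
    """k-vector angle inside the crystal, Eq. (5.32)."""
    s = math.sin(theta_i)
    return math.atan(n_e * s / (n_o * math.sqrt(n_e**2 - s**2)))

def theta_s_extraordinary(theta_i, n_o, n_e):
    """Poynting-vector angle inside the crystal, Eq. (5.40)."""
    s = math.sin(theta_i)
    return math.atan(n_o * s / (n_e * math.sqrt(n_e**2 - s**2)))

n_o, n_e = 1.54424, 1.55335        # quartz indices quoted in P5.6
theta_i = math.radians(60.0)       # illustrative incident angle (assumption)

theta_t = theta_t_extraordinary(theta_i, n_o, n_e)
theta_s = theta_s_extraordinary(theta_i, n_o, n_e)

# Same angle obtained from Eq. (5.39): tan(theta_S) = (n_o^2 / n_e^2) tan(theta_t)
theta_s_from_539 = math.atan((n_o / n_e) ** 2 * math.tan(theta_t))

print(f"theta_t = {math.degrees(theta_t):.4f} deg")
print(f"theta_S = {math.degrees(theta_s):.4f} deg")
print(f"walk-off (theta_t - theta_S) = {math.degrees(theta_t - theta_s):.4f} deg")
```

Because quartz has $n_e > n_o$, the factor $n_o^2/n_e^2$ is less than one and the energy flow direction lies slightly closer to the surface normal than the k-vector does.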



As we noted in the last example, we have the case where ordinary polarized light is
s-polarized light, and extraordinary polarized light is p-polarized light, due to our
specific choice of orientation for the optic axis in this section. In general, the s- and
p-polarized portions of the incident light can each give rise to both extraordinary
and ordinary rays.



Appendix 5.A Symmetry of Susceptibility Tensor

Here we show that the assumption of a non-absorbing (and non optically active)
medium implies that the susceptibility tensor is symmetric. We assume that P is
due to a single species of electron, so that we have $\mathbf{P} = N\mathbf{p}$. Here N is the number
of microscopic dipoles per volume and $\mathbf{p} = q_e\mathbf{r}$, where $q_e$ is the charge on the
electron and r is the microscopic displacement of the electron. The force on this
electron due to the electric field is given by $\mathbf{F} = \mathbf{E}q_e$. With these definitions, we
can use (5.1) to write a connection between the force due to a static E and the
electron displacement:

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \frac{\epsilon_0}{Nq_e^2}\begin{pmatrix} \chi_{xx} & \chi_{xy} & \chi_{xz} \\ \chi_{yx} & \chi_{yy} & \chi_{yz} \\ \chi_{zx} & \chi_{zy} & \chi_{zz} \end{pmatrix}\begin{pmatrix} F_x \\ F_y \\ F_z \end{pmatrix} \tag{5.41}$$
The column vector on the left represents the components of the displacement 
r. We next invert (5.41) to find the force of the electric field on an electron as a 
function of its displacement 4 



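$$\begin{pmatrix} F_x \\ F_y \\ F_z \end{pmatrix} = \begin{pmatrix} k_{xx} & k_{xy} & k_{xz} \\ k_{yx} & k_{yy} & k_{yz} \\ k_{zx} & k_{zy} & k_{zz} \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} \tag{5.42}$$

where

$$\begin{pmatrix} k_{xx} & k_{xy} & k_{xz} \\ k_{yx} & k_{yy} & k_{yz} \\ k_{zx} & k_{zy} & k_{zz} \end{pmatrix} \equiv \frac{Nq_e^2}{\epsilon_0}\begin{pmatrix} \chi_{xx} & \chi_{xy} & \chi_{xz} \\ \chi_{yx} & \chi_{yy} & \chi_{yz} \\ \chi_{zx} & \chi_{zy} & \chi_{zz} \end{pmatrix}^{-1} \tag{5.43}$$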



The total work done on an electron in moving it to its displaced position is 
given by 



$$W = \int_{\text{path}} \mathbf{F}(\mathbf{r}')\cdot d\mathbf{r}' \tag{5.44}$$



While there are many possible paths for getting the electron to any specific dis- 
placement (each path specified by a different history of the electric field) , the 
work done along any of these paths must be the same if the system is conservative 
(i.e. no absorption). For example, for a final displacement of r = xx + yy we could 
have the following two paths: 



[Figure: two paths from (0,0,0) to (x,y,0). Path 1 runs first along x and then along y; Path 2 runs first along y and then along x.]

4 This inversion assumes the field changes slowly so the forces on the electron are always
essentially balanced. This is not true for optical fields, but the proof gives the right flavor for why
conservation of energy results in the symmetry. A more formal proof that doesn't make this
assumption can be found in Principles of Optics, 7th Ed., Born and Wolf, pp. 790-791.

We can use (5.42) in (5.44) to calculate the total work done on the electron
along path 1:

$$W = \int_0^x F_x(x', y'=0, z'=0)\,dx' + \int_0^y F_y(x'=x, y', z'=0)\,dy' = \int_0^x k_{xx}x'\,dx' + \int_0^y (k_{yx}x + k_{yy}y')\,dy' = \frac{k_{xx}}{2}x^2 + k_{yx}xy + \frac{k_{yy}}{2}y^2$$

If we take path 2, the total work is

$$W = \int_0^y F_y(x'=0, y', z'=0)\,dy' + \int_0^x F_x(x', y'=y, z'=0)\,dx' = \int_0^y k_{yy}y'\,dy' + \int_0^x (k_{xx}x' + k_{xy}y)\,dx' = \frac{k_{yy}}{2}y^2 + k_{xy}xy + \frac{k_{xx}}{2}x^2$$

Since the work must be the same for these two paths, we clearly have k xy = k yx . 
Similar arguments for other pairs of dimensions ensure that the matrix of k 
coefficients is symmetric. From linear algebra, we learn that if the inverse of a 
matrix is symmetric then the matrix itself is also symmetric. When we combine 
this result with the definition (5.43), we see that the assumption of no absorption 
requires the susceptibility matrix to be symmetric. 
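The path-independence argument can be illustrated numerically. The sketch below integrates the force $\mathbf{F} = k\,\mathbf{r}$ along the two paths for a symmetric and a non-symmetric matrix of k coefficients. All numerical values and function names are made up for illustration; only the x-y block of the matrix is needed for paths in the z = 0 plane.

```python
def work_along(path, k, steps=2000):
    """Midpoint-rule line integral of F = k r along straight segments in the x-y plane."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for i in range(steps):
            s = (i + 0.5) / steps
            x, y = x0 + s * (x1 - x0), y0 + s * (y1 - y0)
            f_x = k[0][0] * x + k[0][1] * y
            f_y = k[1][0] * x + k[1][1] * y
            total += f_x * dx + f_y * dy
    return total

x_end, y_end = 1.5, 0.8                                  # illustrative final displacement
path_1 = [(0.0, 0.0), (x_end, 0.0), (x_end, y_end)]      # along x, then along y
path_2 = [(0.0, 0.0), (0.0, y_end), (x_end, y_end)]      # along y, then along x

k_symmetric = [[2.0, 0.5], [0.5, 3.0]]                   # k_xy = k_yx
k_asymmetric = [[2.0, 0.5], [0.9, 3.0]]                  # k_xy != k_yx

diff_symmetric = work_along(path_1, k_symmetric) - work_along(path_2, k_symmetric)
diff_asymmetric = work_along(path_1, k_asymmetric) - work_along(path_2, k_asymmetric)
print(diff_symmetric, diff_asymmetric)
```

For the non-symmetric matrix the two results differ by $(k_{yx} - k_{xy})xy$, in agreement with the closed-form expressions for the two paths, so a conservative system requires $k_{xy} = k_{yx}$.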



Appendix 5.B Rotation of Coordinates 

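In this appendix, we go through the labor of showing that (5.1) can always be
written as (5.3) via rotations of the coordinate system, given that the susceptibility
tensor is symmetric (i.e. $\chi_{ij} = \chi_{ji}$). We have

$$\mathbf{P} = \epsilon_0\boldsymbol{\chi}\mathbf{E} \tag{5.45}$$

where

$$\mathbf{E} \equiv \begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix} \qquad \mathbf{P} \equiv \begin{pmatrix} P_x \\ P_y \\ P_z \end{pmatrix} \qquad \boldsymbol{\chi} \equiv \begin{pmatrix} \chi_{xx} & \chi_{xy} & \chi_{xz} \\ \chi_{xy} & \chi_{yy} & \chi_{yz} \\ \chi_{xz} & \chi_{yz} & \chi_{zz} \end{pmatrix} \tag{5.46}$$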
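Our task is to find a new coordinate system x', y', and z' for which the susceptibility
tensor is diagonal. That is, we want to choose x', y', and z' such that

$$\mathbf{P}' = \epsilon_0\boldsymbol{\chi}'\mathbf{E}' \tag{5.47}$$

where

$$\mathbf{E}' \equiv \begin{pmatrix} E'_{x'} \\ E'_{y'} \\ E'_{z'} \end{pmatrix} \qquad \mathbf{P}' \equiv \begin{pmatrix} P'_{x'} \\ P'_{y'} \\ P'_{z'} \end{pmatrix} \qquad \boldsymbol{\chi}' \equiv \begin{pmatrix} \chi'_{x'x'} & 0 & 0 \\ 0 & \chi'_{y'y'} & 0 \\ 0 & 0 & \chi'_{z'z'} \end{pmatrix} \tag{5.48}$$

To arrive at the new coordinate system, we are free to make pure rotation
transformations. In a manner similar to (6.29), a rotation through an angle $\gamma$ about the
z-axis, followed by a rotation through an angle $\beta$ about the resulting y-axis, and
finally a rotation through an angle $\alpha$ about the new x-axis, can be written as

$$\mathbf{R} \equiv \begin{pmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ R_{31} & R_{32} & R_{33} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{pmatrix}\begin{pmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{pmatrix}\begin{pmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

$$= \begin{pmatrix} \cos\beta\cos\gamma & \cos\beta\sin\gamma & \sin\beta \\ -\cos\alpha\sin\gamma - \sin\alpha\sin\beta\cos\gamma & \cos\alpha\cos\gamma - \sin\alpha\sin\beta\sin\gamma & \sin\alpha\cos\beta \\ \sin\alpha\sin\gamma - \cos\alpha\sin\beta\cos\gamma & -\sin\alpha\cos\gamma - \cos\alpha\sin\beta\sin\gamma & \cos\alpha\cos\beta \end{pmatrix} \tag{5.49}$$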



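The matrix R produces an arbitrary rotation of coordinates in three dimensions.
Specifically, we can write:

$$\mathbf{E}' = \mathbf{R}\mathbf{E} \qquad \mathbf{P}' = \mathbf{R}\mathbf{P} \tag{5.50}$$

These transformations can be inverted to give

$$\mathbf{E} = \mathbf{R}^{-1}\mathbf{E}' \qquad \mathbf{P} = \mathbf{R}^{-1}\mathbf{P}' \tag{5.51}$$

where

$$\mathbf{R}^{-1} = \begin{pmatrix} \cos\beta\cos\gamma & -\cos\alpha\sin\gamma - \sin\alpha\sin\beta\cos\gamma & \sin\alpha\sin\gamma - \cos\alpha\sin\beta\cos\gamma \\ \cos\beta\sin\gamma & \cos\alpha\cos\gamma - \sin\alpha\sin\beta\sin\gamma & -\sin\alpha\cos\gamma - \cos\alpha\sin\beta\sin\gamma \\ \sin\beta & \sin\alpha\cos\beta & \cos\alpha\cos\beta \end{pmatrix} = \begin{pmatrix} R_{11} & R_{21} & R_{31} \\ R_{12} & R_{22} & R_{32} \\ R_{13} & R_{23} & R_{33} \end{pmatrix} = \mathbf{R}^T \tag{5.52}$$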


Note that the inverse of the rotation matrix is the same as its transpose, an impor- 
tant feature that we exploit in what follows. 
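Upon inserting (5.51) into (5.45) we have

$$\mathbf{R}^{-1}\mathbf{P}' = \epsilon_0\boldsymbol{\chi}\mathbf{R}^{-1}\mathbf{E}' \tag{5.53}$$

or

$$\mathbf{P}' = \epsilon_0\mathbf{R}\boldsymbol{\chi}\mathbf{R}^{-1}\mathbf{E}' \tag{5.54}$$

From this equation we see that the new susceptibility tensor we seek for (5.47) is

$$\boldsymbol{\chi}' \equiv \mathbf{R}\boldsymbol{\chi}\mathbf{R}^{-1}$$

$$= \begin{pmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ R_{31} & R_{32} & R_{33} \end{pmatrix}\begin{pmatrix} \chi_{xx} & \chi_{xy} & \chi_{xz} \\ \chi_{xy} & \chi_{yy} & \chi_{yz} \\ \chi_{xz} & \chi_{yz} & \chi_{zz} \end{pmatrix}\begin{pmatrix} R_{11} & R_{21} & R_{31} \\ R_{12} & R_{22} & R_{32} \\ R_{13} & R_{23} & R_{33} \end{pmatrix}$$

$$= \begin{pmatrix} \chi'_{x'x'} & \chi'_{x'y'} & \chi'_{x'z'} \\ \chi'_{x'y'} & \chi'_{y'y'} & \chi'_{y'z'} \\ \chi'_{x'z'} & \chi'_{y'z'} & \chi'_{z'z'} \end{pmatrix} \tag{5.55}$$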

We have expressly indicated that the off-diagonal terms of $\boldsymbol{\chi}'$ are symmetric (i.e.
$\chi'_{ij} = \chi'_{ji}$). This can be verified by performing the multiplication in (5.55). It is a
consequence of $\boldsymbol{\chi}$ being symmetric and $\mathbf{R}^{-1}$ being equal to $\mathbf{R}^T$.

The three off-diagonal elements of $\boldsymbol{\chi}'$ (appearing both above and below the
diagonal) are found by performing the matrix multiplication in the second line
of (5.55). The specific expressions for these three elements are not particularly
enlightening. The important point is that we can make all three of them equal to
zero since we have three degrees of freedom in the angles $\alpha$, $\beta$, and $\gamma$. Although
we do not expressly solve for the angles, we have demonstrated that it is always
possible to set

$$\chi'_{x'y'} = \chi'_{x'z'} = \chi'_{y'z'} = 0 \tag{5.56}$$

This justifies (5.3). 
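Two properties used in this appendix, that $\mathbf{R}^{-1} = \mathbf{R}^T$ and that $\mathbf{R}\boldsymbol{\chi}\mathbf{R}^{-1}$ stays symmetric when $\boldsymbol{\chi}$ is symmetric, are easy to confirm numerically. The sketch below builds R from the three elementary rotations in (5.49). The angles and the entries of the susceptibility tensor are arbitrary made-up values, and the helper names are assumptions of this sketch. It does not solve for the angles that diagonalize the tensor.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def rotation(alpha, beta, gamma):
    """R of Eq. (5.49): rotate by gamma about z, then beta about y, then alpha about x."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    r_x = [[1, 0, 0], [0, ca, sa], [0, -sa, ca]]
    r_y = [[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]]
    r_z = [[cg, sg, 0], [-sg, cg, 0], [0, 0, 1]]
    return matmul(r_x, matmul(r_y, r_z))

alpha, beta, gamma = 0.4, -0.7, 1.1            # arbitrary angles in radians
chi = [[1.20, 0.30, -0.10],                    # made-up symmetric susceptibility tensor
       [0.30, 0.80, 0.25],
       [-0.10, 0.25, 1.50]]

R = rotation(alpha, beta, gamma)
should_be_identity = matmul(R, transpose(R))                 # tests R^{-1} = R^T
chi_prime = matmul(R, matmul(chi, transpose(R)))             # Eq. (5.55)

for row in chi_prime:
    print(["%.4f" % value for value in row])
```

The printed matrix is symmetric for any choice of the three angles; finding the particular angles that zero the off-diagonal elements is equivalent to finding the eigenvectors of $\boldsymbol{\chi}$.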



Appendix 5.C Electric Field in Crystals 

To determine the direction of the electric field associated with each value
of n, we return to (5.10), (5.11), and (5.12) in the analysis in section 5.2. These
equations can be written in matrix format as (see footnote 5)

$$\begin{pmatrix} \frac{\omega^2}{c^2}(1+\chi_x) - k_y^2 - k_z^2 & k_xk_y & k_xk_z \\ k_xk_y & \frac{\omega^2}{c^2}(1+\chi_y) - k_x^2 - k_z^2 & k_yk_z \\ k_xk_z & k_yk_z & \frac{\omega^2}{c^2}(1+\chi_z) - k_x^2 - k_y^2 \end{pmatrix}\begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix} = 0 \tag{5.57}$$

where we have used $k_x^2 + k_y^2 + k_z^2 = k^2$. We can divide every element by $k^2$ and
employ the definitions (5.15), (5.17), and (5.18) to make this matrix equation look
slightly nicer:

$$\begin{pmatrix} \frac{n_x^2}{n^2} - u_y^2 - u_z^2 & u_xu_y & u_xu_z \\ u_xu_y & \frac{n_y^2}{n^2} - u_x^2 - u_z^2 & u_yu_z \\ u_xu_z & u_yu_z & \frac{n_z^2}{n^2} - u_x^2 - u_y^2 \end{pmatrix}\begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix} = 0 \tag{5.58}$$

5 A. Yariv and P. Yeh, Optical Waves in Crystals, Sect. 4.2 (New York: Wiley, 1984).

For (5.58) to have a non-trivial solution (i.e. non zero fields), the determinant 
of the matrix must be zero. Imposing this requirement is an equivalent way to 
derive Fresnel's equation (5.19) for n. 

Given a direction for u and a value for n (from Fresnel's equation), we can use
(5.58) to determine the direction of the electric field associated with that index.
It is left as an exercise to show that when all three components are nonzero (i.e.
$u_x \neq 0$, $u_y \neq 0$, and $u_z \neq 0$), the appropriate field direction for a value of n is given
by

$$\begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix} \propto \begin{pmatrix} \dfrac{u_x}{n^2 - n_x^2} \\ \dfrac{u_y}{n^2 - n_y^2} \\ \dfrac{u_z}{n^2 - n_z^2} \end{pmatrix} \tag{5.59}$$

This is a proportionality rather than an equation because Maxwell's equations 
only specify the direction of E — we are free to choose the amplitude. Because 
Fresnel's equation gives two values for n, (5.59) specifies two distinct polarization 
components associated with each propagation direction u. These polarization 
components form a natural basis for describing light propagation in a crystal. 
When light is composed of a mixture of these two polarizations, the two polariza- 
tion components experience different indices of refraction. 
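To make the procedure concrete, the sketch below finds the two indices for a chosen direction u and then builds the field directions from (5.59), checking that each one satisfies the matrix equation (5.58). The coefficients `A`, `B`, and `C` are obtained here by clearing the denominators in Fresnel's equation and are named for this sketch only; the spherical form of u, the 40° polar angle, and the function names are likewise assumptions. The indices are the potassium niobate values quoted in the caption of Fig. 5.7.

```python
import math

def fresnel_indices(u, n_xyz):
    """Two roots n of Fresnel's equation, from a quadratic A n^4 - B n^2 + C = 0."""
    ux2, uy2, uz2 = (c * c for c in u)
    nx2, ny2, nz2 = (n * n for n in n_xyz)
    A = ux2 * nx2 + uy2 * ny2 + uz2 * nz2
    B = ux2 * nx2 * (ny2 + nz2) + uy2 * ny2 * (nx2 + nz2) + uz2 * nz2 * (nx2 + ny2)
    C = nx2 * ny2 * nz2
    root = math.sqrt(B * B - 4 * A * C)
    return math.sqrt((B - root) / (2 * A)), math.sqrt((B + root) / (2 * A))

def matrix_558(u, n_xyz, n):
    """Matrix of Eq. (5.58) for direction u and trial index n."""
    ux, uy, uz = u
    dx, dy, dz = ((ni / n) ** 2 for ni in n_xyz)
    return [[dx - uy * uy - uz * uz, ux * uy, ux * uz],
            [ux * uy, dy - ux * ux - uz * uz, uy * uz],
            [ux * uz, uy * uz, dz - ux * ux - uy * uy]]

def field_direction(u, n_xyz, n):
    """Normalized field direction from Eq. (5.59); requires all components of u nonzero."""
    E = [ui / (n * n - ni * ni) for ui, ni in zip(u, n_xyz)]
    norm = math.sqrt(sum(e * e for e in E))
    return [e / norm for e in E]

n_xyz = (2.22, 2.34, 2.41)                       # KNbO3 values from the Fig. 5.7 caption
theta, phi = math.radians(40.0), math.pi / 4     # illustrative direction (assumption)
u = (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))

n_fast, n_slow = fresnel_indices(u, n_xyz)
residuals = []
for n in (n_fast, n_slow):
    E = field_direction(u, n_xyz, n)
    M = matrix_558(u, n_xyz, n)
    residuals.append(max(abs(sum(M[i][j] * E[j] for j in range(3))) for i in range(3)))
    print(f"n = {n:.5f}, E direction = {[round(e, 4) for e in E]}")
```

Each residual is the largest component of the product of the matrix in (5.58) with the trial field, so values near round-off confirm that (5.59) gives a valid field direction for each index.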

If any of the components of u (i.e. $u_x$, $u_y$, or $u_z$) is precisely zero, the corresponding
entry in (5.59) yields a zero-over-zero situation. This happens when
at least one of the dimensions in (5.58) becomes decoupled from the others. In
these cases, you can go back and re-solve (5.58) for the polarization directions as in the
following example.



Example 5.4 

Determine the directions of the two polarization components associated with light 
propagating in the u = z direction. (Compare with Example 5.1.) 

Solution: In this case we have $u_x = u_y = 0$, so as noted above, we have to go back
to (5.58) and re-solve. In our case, the set of equations becomes

$$\begin{pmatrix} \frac{n_x^2}{n^2} - 1 & 0 & 0 \\ 0 & \frac{n_y^2}{n^2} - 1 & 0 \\ 0 & 0 & \frac{n_z^2}{n^2} \end{pmatrix}\begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix} = 0 \tag{5.60}$$

Notice that all three dimensions are decoupled in this system (i.e. there are no
off-diagonal terms). In Example 5.1 we found that the two values of n associated
with $\mathbf{u} = \hat{\mathbf{z}}$ are $n_x$ and $n_y$. If we use $n = n_x$ in our set of equations, we have

$$\begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac{n_y^2}{n_x^2} - 1 & 0 \\ 0 & 0 & \frac{n_z^2}{n_x^2} \end{pmatrix}\begin{pmatrix} E_x \\ E_y \\ E_z \end{pmatrix} = 0$$

Assuming $n_x$ and $n_y$ are unique so that $n_y/n_x \neq 1$, these equations require $E_y =
E_z = 0$ but allow $E_x$ to be non-zero. This proves our earlier assertion that the index
$n_x$ is associated with light polarized in the x-dimension in the special case of $\mathbf{u} = \hat{\mathbf{z}}$.
Similarly, when $n_y$ is inserted into (5.60), we find that it is associated with light
polarized in the y-dimension.



We can use (5.59) to study the behavior of polarization direction as the direction
of propagation varies. Figure 5.7 shows plots of the polarization direction (i.e.
normalized $E_x$, $E_y$, and $E_z$) in potassium niobate as the propagation direction
is varied. The plot is created by inserting the spherical representation of u (5.24)
into Fresnel's equation (5.20) for a chosen sign of the ±, and then inserting the
resulting n into (5.59) to find the associated electric field. As we saw in Example 5.4,
at $\theta = 0$ the light associated with the slow index is polarized along the y-axis and
the light associated with the fast index is polarized along the x-axis.

In Fig. 5.7(c) we have plotted the angle between the two polarization components.
At $\theta = 0$, the two polarization components are 90° apart, as you might
expect. However, notice that in other propagation directions the two linear
polarization components are not precisely perpendicular. Even so, the two polarization
components of E are orthogonal in a mathematical sense (see footnote 6), so that they still
comprise a useful basis for decomposing the light field.
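The statement that the two field components are not quite perpendicular, while the corresponding displacement vectors are, can be explored with a short calculation. The sketch below is an illustration under assumptions: the direction u uses the spherical form with an arbitrarily chosen polar angle of 40° and $\phi = \pi/4$, the quadratic coefficients are obtained by clearing denominators in Fresnel's equation, and the displacement direction is taken as $D_i \propto n_i^2 E_i$ in the principal-axis frame. It uses the potassium niobate indices from the caption of Fig. 5.7.

```python
import math

def fresnel_n_squared(u, n_xyz):
    """Both roots for n^2 (fast, slow) of Fresnel's equation written as A N^2 - B N + C = 0."""
    ux2, uy2, uz2 = (c * c for c in u)
    nx2, ny2, nz2 = (n * n for n in n_xyz)
    A = ux2 * nx2 + uy2 * ny2 + uz2 * nz2
    B = ux2 * nx2 * (ny2 + nz2) + uy2 * ny2 * (nx2 + nz2) + uz2 * nz2 * (nx2 + ny2)
    C = nx2 * ny2 * nz2
    root = math.sqrt(B * B - 4 * A * C)
    return (B - root) / (2 * A), (B + root) / (2 * A)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_between(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

n_xyz = (2.22, 2.34, 2.41)                        # KNbO3 values from the Fig. 5.7 caption
theta, phi = math.radians(40.0), math.pi / 4      # illustrative direction (assumption)
u = (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))

N_fast, N_slow = fresnel_n_squared(u, n_xyz)
E_fast = [ui / (N_fast - ni * ni) for ui, ni in zip(u, n_xyz)]   # Eq. (5.59)
E_slow = [ui / (N_slow - ni * ni) for ui, ni in zip(u, n_xyz)]

# Displacement directions in the principal-axis frame: D_i proportional to n_i^2 E_i
D_fast = [ni * ni * e for ni, e in zip(n_xyz, E_fast)]
D_slow = [ni * ni * e for ni, e in zip(n_xyz, E_slow)]

cos_E = cosine_between(E_fast, E_slow)
cos_D = cosine_between(D_fast, D_slow)
print(f"angle between E components: {math.degrees(math.acos(cos_E)):.4f} deg")
print(f"cosine of angle between D components: {cos_D:.2e}")
```

For this direction the angle between the two E directions comes out slightly different from 90°, while the cosine between the two D directions vanishes to within round-off.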



Determining the Fields in a Uniaxial Crystal. 

To find the directions of the electric field for light that experiences the normal
index of refraction in a uniaxial crystal, we insert $n = n_o$ into the requirement
(5.58), and solve for the allowed fields (see P5.9) to find

$$\mathbf{E}_o(\mathbf{u}) \propto \begin{pmatrix} -\sin\phi \\ \cos\phi \\ 0 \end{pmatrix} \tag{5.61}$$



This field component is associated with the ordinary wave because just as in an
isotropic medium such as glass, the index of refraction for light with this polarization
does not vary with $\theta$. The polarization component associated with $n_e(\theta)$ is
found by using (5.59):

$$\mathbf{E}_e(\mathbf{u}) \propto \begin{pmatrix} \dfrac{\sin\theta\cos\phi}{n_e^2(\theta) - n_o^2} \\ \dfrac{\sin\theta\sin\phi}{n_e^2(\theta) - n_o^2} \\ \dfrac{\cos\theta}{n_e^2(\theta) - n_e^2} \end{pmatrix} \tag{5.62}$$

Figure 5.7 Polarization direction associated with the two values of n
in potassium niobate (KNbO3) at $\lambda = 500$ nm ($n_x = 2.22$, $n_y = 2.34$,
and $n_z = 2.41$) and $\phi = \pi/4$. (a) Polarization direction for slow index.
(b) Polarization direction for fast index. Frame (c) shows the angle between the
two polarization components.

6 The two components of the electric displacement vector $\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P}$ remain perpendicular.

Notice that this polarization component is partially directed along the optic axis
(i.e. it has a z-component), and it is not perpendicular to k since $\mathbf{u}\cdot\mathbf{E}_e(\mathbf{u}) \neq 0$ (see
P5.10). It is, however, perpendicular to the ordinary polarization component, since
$\mathbf{E}_e\cdot\mathbf{E}_o = 0$.

Notice that when $\theta = 0$, (5.29) reduces to $n = n_o$ so that both indices are the same.
On the other hand, if $\theta = \pi/2$ then (5.29) reduces to $n = n_e$.
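A quick numerical look at (5.61) and (5.62) illustrates both statements above. The sketch assumes the form of (5.29) that has the limits just described, the spherical representation of u, and arbitrarily chosen angles; the quartz indices are those quoted in P5.6, and the variable names are illustrative.

```python
import math

n_o, n_e = 1.54424, 1.55335                      # quartz indices quoted in P5.6
theta, phi = math.radians(30.0), math.radians(20.0)   # illustrative direction (assumption)

# Angle-dependent extraordinary index, Eq. (5.29), squared
s, c = math.sin(theta), math.cos(theta)
N = (n_o * n_e) ** 2 / (n_o**2 * s**2 + n_e**2 * c**2)

u = (s * math.cos(phi), s * math.sin(phi), c)    # propagation direction

E_ordinary = (-math.sin(phi), math.cos(phi), 0.0)             # Eq. (5.61)
E_extraordinary = (s * math.cos(phi) / (N - n_o**2),          # Eq. (5.62)
                   s * math.sin(phi) / (N - n_o**2),
                   c / (N - n_e**2))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

e_dot_o = dot(E_extraordinary, E_ordinary)       # vanishes: the two components are perpendicular
u_dot_e = dot(u, E_extraordinary)                # does not vanish: E_e is not transverse to k
print(f"E_e . E_o = {e_dot_o:.2e},  u . E_e = {u_dot_e:.4f}")
```

The first dot product is zero to round-off for any direction, while the second is not, which is the numerical counterpart of P5.10.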







Appendix 5.D Huygens' Elliptical Construct for a Uniaxial 
Crystal 

In 1690 Christiaan Huygens developed a way to predict the direction of extraordinary
rays in a crystal by examining an elliptical wavelet. The point on the elliptical
wavelet that propagates along the optic axis is assumed to experience the index
$n_o$. The point on the elliptical wavelet that propagates perpendicular to the optic
axis is assumed to experience the index $n_e$. It turns out that Huygens' approach
agreed with the direction of energy propagation (5.40) (as opposed to the direction
of the k-vector). This was quite satisfactory in Huygens' day (except that he was
largely ignored for a century, owing to Newton's corpuscular theory) since the
direction of energy propagation is what an observer sees.

Consider a plane wave entering a uniaxial crystal with the optic axis perpen- 
dicular to the surface. In Huygens' point of view, each point on a wave front acts 
as a wavelet source which combines with neighboring wavelets to preserve the 
overall plane wave pattern. Inside the crystal, the wavelets propagate in the shape 
of an ellipse. The equation for an elliptical wave front after propagating during a
time t is

$$\frac{y^2}{(ct/n_e)^2} + \frac{z^2}{(ct/n_o)^2} = 1 \tag{5.63}$$

After rearranging, the equation of the ellipse inside the crystal can also be written
as

$$z = \frac{ct}{n_o}\sqrt{1 - \frac{y^2}{(ct/n_e)^2}} \tag{5.64}$$

Figure 5.8 Elliptical wavelet.

In order to have the wavelet join neatly with other wavelets to build a plane wave,
the wave front of the ellipse must be parallel to a new wave front entering the surface
at a distance $ct/\sin\theta_i$ above the original point. This distance is represented
by the hypotenuse of the right triangle seen in Fig. 5.8. Let the point where the
wave front touches the ellipse be denoted by $(y, z) = (z\tan\theta_S, z)$. The slope (rise
over run) of the line that connects these two points is then

$$\frac{dz}{dy} = -\frac{z}{ct/\sin\theta_i - z\tan\theta_S} \tag{5.65}$$



At the point where the wave front touches the ellipse (i.e., $(y, z) = (z\tan\theta_S, z)$), the
slope of the curve for the ellipse is

$$\frac{dz}{dy} = -\frac{n_e^2\, y}{n_o\, ct\sqrt{1 - \dfrac{y^2}{(ct/n_e)^2}}} = -\frac{n_e^2}{n_o^2}\tan\theta_S \tag{5.66}$$

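We would like these two slopes to be the same. We therefore set them equal to
each other:

$$-\frac{n_e^2}{n_o^2}\tan\theta_S = -\frac{z}{ct/\sin\theta_i - z\tan\theta_S} \quad\Rightarrow\quad \frac{ct}{z}\,\frac{n_e^2\tan\theta_S}{n_o^2\sin\theta_i} = \frac{n_e^2}{n_o^2}\tan^2\theta_S + 1 \tag{5.67}$$

If we evaluate (5.63) for the point $(y, z) = (z\tan\theta_S, z)$, we obtain

$$\frac{ct}{z} = n_o\sqrt{\frac{n_e^2}{n_o^2}\tan^2\theta_S + 1} \tag{5.68}$$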



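Upon substitution of this into (5.67) we arrive at

$$\frac{n_e^2\tan\theta_S}{n_o\sin\theta_i} = \sqrt{\frac{n_e^2}{n_o^2}\tan^2\theta_S + 1} \quad\Rightarrow\quad \frac{n_e^4\tan^2\theta_S}{n_o^2\sin^2\theta_i} = \frac{n_e^2}{n_o^2}\tan^2\theta_S + 1 \tag{5.69}$$

$$\Rightarrow\quad \tan^2\theta_S\,\frac{n_e^2}{n_o^2}\left[\frac{n_e^2}{\sin^2\theta_i} - 1\right] = 1 \quad\Rightarrow\quad \tan\theta_S = \frac{n_o\sin\theta_i}{n_e\sqrt{n_e^2 - \sin^2\theta_i}} \tag{5.70}$$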


This agrees with (5.40) as anticipated. Again, Huygens' approach obtained the 
correct direction of the Poynting vector associated with the extraordinary wave.

Exercises

Exercises for 5.2 Plane Wave Propagation in Crystals 

P5.1 Solve Fresnel's equation (5.19) to find the two values of n associated
with a given u. Show that both solutions yield a positive index of
refraction.

HINT: Show that (5.19) can be manipulated into the form

$$0 = \left[(u_x^2 + u_y^2 + u_z^2) - 1\right]n^6 + \left[(n_x^2 + n_y^2 + n_z^2) - u_x^2(n_y^2 + n_z^2) - u_y^2(n_x^2 + n_z^2) - u_z^2(n_x^2 + n_y^2)\right]n^4$$
$$\qquad - \left[(n_x^2n_y^2 + n_x^2n_z^2 + n_y^2n_z^2) - u_x^2n_y^2n_z^2 - u_y^2n_x^2n_z^2 - u_z^2n_x^2n_y^2\right]n^2 + n_x^2n_y^2n_z^2$$

The coefficient of $n^6$ is identically zero since by definition we have
$u_x^2 + u_y^2 + u_z^2 = 1$.

P5.2 Suppose you have a crystal with $n_x = 1.5$, $n_y = 1.6$, and $n_z = 2.0$. Use
Fresnel's equation to determine what the two indices of refraction are
for a k-vector in the crystal along the $\mathbf{u} = (\hat{\mathbf{x}} + 2\hat{\mathbf{y}} + 3\hat{\mathbf{z}})/\sqrt{14}$ direction.

Exercises for 5.3 Biaxial and Uniaxial Crystals

P5.3 Given that the optic axes are in the x-z plane, show that the direction
of the optic axes are given by (5.25).

HINT: The two indices are the same when $B^2 - 4AC = 0$. You will want
to use polar coordinates for the direction unit vector, as in (5.24). Set
$\phi = 0$ so you are in the x-z plane. Use $\sin^2\theta + \cos^2\theta = 1$ to get an
equation that only has cosine terms and solve for $\cos^2\theta$.

P5.4 Use definitions (5.26) and (5.27) along with the spherical representation
of u (5.24) in Fresnel's equation (5.20) to calculate the two values
for the index in a uniaxial crystal (i.e. (5.28) and (5.29)).

HINT: First show that



and then use these expressions to evaluate Fresnel's equation.



P5.5 Derive (5.32).

P5.6 Suppose you have a quartz plate (a uniaxial crystal) with its optic axis
oriented perpendicular to the surfaces. The indices of refraction for
quartz are $n_o = 1.54424$ and $n_e = 1.55335$. A plane wave with wavelength
$\lambda_{\rm vac} = 633$ nm passes through the plate. After emerging from
the crystal, there is a phase difference $\Delta$ between the two polarization
components of the plane wave, and this phase difference depends on
incident angle $\theta_i$. Use a computer to plot $\Delta$ as a function of incident
angle from zero to 90° for a plate with thickness d = 0.96 mm.

HINT: For s-polarized light, show that the number of wavelengths that
fit in the plate is $\dfrac{d}{(\lambda_{\rm vac}/n_o)\cos\theta_t^{(s)}}$. For p-polarized light, show that the
number of wavelengths that fit in the plate and the extra leg $\delta$ outside
of the plate (see Fig. 5.9) is

$$\frac{d}{(\lambda_{\rm vac}/n_p)\cos\theta_t^{(p)}} + \frac{\delta}{\lambda_{\rm vac}}, \qquad \text{where} \qquad \delta = d\left[\tan\theta_t^{(s)} - \tan\theta_t^{(p)}\right]\sin\theta_i$$

and $n_p$ is given by (5.29). Find the difference between these expressions
and multiply by $2\pi$ to find $\Delta$.

Figure 5.9 Diagram for P5.6.

Figure 5.10 Plot for P5.6 and L5.7.

L5.7 In the laboratory, send a HeNe laser ($\lambda_{\rm vac} = 633$ nm) through two
crossed polarizers, oriented at 45° and 135°. Place the quartz plate
described in P5.6 between the polarizers on a rotation stage. Now
equal amounts of s- and p-polarized light strike the crystal as it is
rotated from normal incidence. (video)

Figure 5.11 Schematic for L5.7. (In order: laser, polarizer, quartz crystal on a rotation stage, polarizer, screen.)

If the phase shift between the two paths discussed in P5.6 is an odd
integer times $\pi$, the polarization direction of the light transmitted
through the crystal is rotated by 90°, and the maximum transmission
through the second polarizer results. (In this configuration, the crystal
acts as a half wave plate, which we discuss in Chapter 6.) If the phase
shift is an even integer times $\pi$, then the polarization is rotated by 180°
and minimum transmission through the second polarizer results. Plot
these measured maximum and minimum points on your computer-generated
graph of the previous problem.



Exercises for 5.C Electric Field in Crystals

P5.8 Show that (5.59) is a solution to (5.58).

P5.9 Show that the field polarization component associated with $n = n_o$ in
a uniaxial crystal is directed perpendicular to the plane containing u
and z by substituting this value for n into (5.58) and determining what
combination of field components are allowable.

HINT: Use (5.24) to represent u with $\phi = 0$ (the index is the same for all
$\phi$, so you may as well use one that makes calculation easy). When you
substitute into (5.58) you will find that $E_y$ can be any value because of
the location of zeros in the matrix.

To get a requirement on $E_x$ and $E_z$, collapse the matrix equation down
to a 2 x 2 system. For non-trivial solutions to exist (i.e. $E_x \neq 0$ or $E_z \neq 0$),
the determinant of the matrix must be zero. Show that this is only the
case if $n_o = n_e$ (i.e. the crystal is isotropic).



P5.10 Show that the electric field for extraordinary polarized light $\mathbf{E}_e(\mathbf{u})$ in a
uniaxial crystal is not perpendicular to k (i.e. u), but that it is perpendicular
to the ordinary polarization component $\mathbf{E}_o(\mathbf{u})$.



Review, Chapters 1-5 



To prepare for an exam, you should understand the following questions and 
problems thoroughly enough to be able to work them without referring back to 
previous chapters. 



True and False Questions 

R1 T or F: The optical index of any material (not vacuum) varies with
frequency.

R2 T or F: The frequency of light can change as it enters a crystal (consider
low intensity — no nonlinear effects).

R3 T or F: The entire expression $\mathbf{E}_0e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}$ associated with a light field
(both the real part and the imaginary parts) is physically relevant.

R4 T or F: The real part of the refractive index cannot be less than one.

R5 T or F: s-polarized light and p-polarized light experience the same
phase shift upon reflection from a material with complex index.

R6 T or F: When light is incident upon a material interface at Brewster's
angle, only one polarization can transmit.

R7 T or F: When light is incident upon a material interface at Brewster's
angle one of the polarizations stimulates dipoles in the material to
oscillate with orientation along the direction of the reflected k-vector.

R8 T or F: The critical angle for total internal reflection exists on both sides
of a material interface.

R9 T or F: From any given location above a (smooth flat) surface of water,
it is possible to see objects positioned anywhere under the water.

R10 T or F: From any given location beneath a (smooth flat) surface of water,
it is possible to see objects positioned anywhere above the water.

R11 T or F: An evanescent wave travels parallel to an interface surface on
the transmitted side.

R12 T or F: When p-polarized light enters a material at Brewster's angle, the
intensity of the transmitted beam is the same as the intensity of the
incident beam.

R13 T or F: For incident angles beyond the critical angle for total internal
reflection, the Fresnel coefficients $t_s$ and $t_p$ are both zero.

R14 T or F: It is always possible to completely eliminate reflections using a
single-layer antireflection coating if you are free to choose the coating
thickness but not its index.

R15 T or F: For a given incident angle and value of n, there is only one
single-layer coating thickness d that will minimize reflections.

R16 T or F: When coating each surface of a lens with a single-layer antireflection
coating, the thickness of the coating on the exit surface will
need to be different from the thickness of the coating on the entry
surface.

R17 T or F: As light enters a crystal, the Poynting vector always obeys Snell's
law.

R18 T or F: As light enters a crystal, the k-vector does not obey Snell's law
for the extraordinary wave.



Problems

R19 (a) Write down Maxwell's equations.

(b) Derive the wave equation for E under the assumptions that $\mathbf{J}_{\rm free} = 0$
and $\mathbf{P} = \epsilon_0\chi\mathbf{E}$. Note: $\nabla\times(\nabla\times\mathbf{f}) = \nabla(\nabla\cdot\mathbf{f}) - \nabla^2\mathbf{f}$.

(c) Show by direct substitution that $\mathbf{E}(\mathbf{r},t) = \mathbf{E}_0e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}$ is a solution to
the wave equation. Find the resulting connection between k and $\omega$.
Give appropriate definitions for c and n, assuming that $\chi$ is real.

(d) If $\mathbf{k} = k\hat{\mathbf{z}}$ and $\mathbf{E}_0 = E_0\hat{\mathbf{x}}$, find the associated B-field.

(e) The Poynting vector is $\mathbf{S} = \mathbf{E}\times\mathbf{B}/\mu_0$, where the fields are real. Derive
an expression for $I = \langle S\rangle_t$.

R20 Consider an interface between two isotropic media where the incident
field is defined by

$$\mathbf{E}_i = \left[E_i^{(p)}(\hat{\mathbf{y}}\cos\theta_i - \hat{\mathbf{z}}\sin\theta_i) + \hat{\mathbf{x}}E_i^{(s)}\right]e^{i[k_i(y\sin\theta_i + z\cos\theta_i) - \omega_it]}$$

The plane of incidence is shown in Fig. 5.12.

(a) By inspection of the figure, write down similar expressions for the
reflected and transmitted fields (i.e. $\mathbf{E}_r$ and $\mathbf{E}_t$).



Figure 5.12

(b) Find an expression relating $\mathbf{E}_i$, $\mathbf{E}_r$, and $\mathbf{E}_t$ using the boundary condition
at the interface. From this expression obtain the law of reflection
and Snell's law.

(c) The boundary condition requiring that the tangential component
of B must be continuous leads to

$$n_i\left(E_i^{(s)} - E_r^{(s)}\right)\cos\theta_i = n_tE_t^{(s)}\cos\theta_t$$

Use this and the results from part (b) to derive

$$\frac{E_r^{(p)}}{E_i^{(p)}} = -\frac{\tan(\theta_i - \theta_t)}{\tan(\theta_i + \theta_t)}$$

You may use the identity

$$\frac{\sin\theta_i\cos\theta_i - \sin\theta_t\cos\theta_t}{\sin\theta_i\cos\theta_i + \sin\theta_t\cos\theta_t} = \frac{\tan(\theta_i - \theta_t)}{\tan(\theta_i + \theta_t)}$$

R21 The Fresnel equations are

$$\frac{E_r^{(s)}}{E_i^{(s)}} = \frac{\sin\theta_t\cos\theta_i - \sin\theta_i\cos\theta_t}{\sin\theta_t\cos\theta_i + \sin\theta_i\cos\theta_t} \qquad \frac{E_t^{(s)}}{E_i^{(s)}} = \frac{2\sin\theta_t\cos\theta_i}{\sin\theta_t\cos\theta_i + \sin\theta_i\cos\theta_t}$$

$$\frac{E_r^{(p)}}{E_i^{(p)}} = \frac{\cos\theta_t\sin\theta_t - \cos\theta_i\sin\theta_i}{\cos\theta_t\sin\theta_t + \cos\theta_i\sin\theta_i} \qquad \frac{E_t^{(p)}}{E_i^{(p)}} = \frac{2\cos\theta_i\sin\theta_t}{\cos\theta_t\sin\theta_t + \cos\theta_i\sin\theta_i}$$

(a) … when $\theta_i = 0$. Give your
answer in terms of $n_i$ and $n_t$.

(b) What percent of light (intensity) reflects from a glass surface (n =
1.5) when light enters from air (n = 1) at normal incidence?

(c) What percent of light reflects from a glass surface when light exits
into air at normal incidence?

R22 Light goes through a glass prism with optical index n = 1.55. The light
enters at Brewster's angle and exits at normal incidence as shown in
Fig. 5.13.

(a) Derive and calculate Brewster's angle $\theta_B$. You may use the results of
R20 (c).

Figure 5.13

(b) Calculate $\phi$.

(c) What percent of the light (power) goes all the way through the prism
if it is p-polarized? You may use the Fresnel coefficients given in R21.

(d) What percent for s-polarized light?

R23 A 45°-90°-45° prism is a good device for reflecting a beam of light
parallel to the initial beam (see Fig. 5.14). The exiting beam will be
parallel to the entering beam even when the incoming beam is not
normal to the front surface (although it needs to be in the plane of the
drawing).

(a) How large an angle $\theta$ can be tolerated before there is no longer total
internal reflection at both interior surfaces? Assume n = 1 outside of
the prism and n = 1.5 inside.

(b) If the light enters and leaves the prism at normal incidence, what
will the difference in phase be between the s- and p-polarizations? You
may use the Fresnel coefficients given in R21.

R24 A thin glass plate with index n = 1.5 is oriented at Brewster's angle so
that p-polarized light with wavelength $\lambda_{\rm vac} = 500$ nm goes through
with 100% transmittance.

(a) What is the minimum thickness that will make the reflection of
s-polarized light be maximum?

(b) What is the total transmittance $T^{\rm tot}$ for this thickness assuming
s-polarized light?

R25 Consider a Fabry-Perot interferometer. Note: $R_1 = R_2 = R$.

(a) Show that the free spectral range for a Fabry-Perot interferometer is

$$\Delta\lambda_{\rm FSR} = \frac{\lambda^2}{2nd\cos\theta}$$

(b) Show that the fringe width is

$$\Delta\lambda_{\rm FWHM} = \frac{\lambda^2}{\pi\sqrt{F}\,nd\cos\theta}$$

where $F \equiv \dfrac{4R}{(1-R)^2}$.

(c) Derive the reflecting finesse $f = \Delta\lambda_{\rm FSR}/\Delta\lambda_{\rm FWHM}$.

R26 For a Fabry-Perot etalon, let $R = 0.90$, $\lambda_{\rm vac} = 500$ nm, $n = 1$, and
$d = 5.0$ mm.

(a) Suppose that a maximum transmittance occurs at the angle $\theta = 0$.
What is the nearest angle where the transmittance will be half of the
maximum transmittance? You may assume that $\cos\theta \cong 1 - \theta^2/2$.

(b) You desire to use a Fabry-Perot etalon to view the light from a large
diffuse source rather than a point source. Draw a diagram depicting
where lenses should be placed, indicating relevant distances. Explain
briefly how it works.

R27 You need to make an antireflective coating for a glass lens designed to
work at normal incidence.

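The matrix equation relating the incident field to the reflected and
transmitted fields (at normal incidence) is

$$\begin{bmatrix} 1 \\ n_0 \end{bmatrix} + \begin{bmatrix} 1 \\ -n_0 \end{bmatrix}\frac{E_0^{\leftarrow}}{E_0^{\rightarrow}} = \begin{bmatrix} \cos k_1\ell & -\frac{i}{n_1}\sin k_1\ell \\ -i n_1\sin k_1\ell & \cos k_1\ell \end{bmatrix}\begin{bmatrix} 1 \\ n_2 \end{bmatrix}\frac{E_2^{\rightarrow}}{E_0^{\rightarrow}}$$

(a) What is the minimum thickness the coating should have?

HINT: It is less work if you can figure this out without referring to the
above equation. You may assume $n_1 < n_2$.

(b) Find the index of refraction $n_1$ that will make the reflectivity be zero.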

R28 Second harmonic generation (the conversion of light with frequency $\omega$
into light with frequency $2\omega$) can occur when very intense laser light
travels in a material. For good harmonic production, the laser light
and the second harmonic light need to travel at the same speed in the
material. In other words, both frequencies need to have the same index
of refraction so that harmonic light produced downstream joins in
phase with the harmonic light produced upstream, referred to as phase
matching. This ensures a coherent building of the second harmonic
field rather than destructive cancellations.

Unfortunately, the index of refraction is almost never the same for different
frequencies in a given material, owing to dispersion. However,
we can achieve phase matching in some crystals where one frequency
propagates as an ordinary wave and the other propagates as an extraordinary
wave. We cause the two indices to be precisely the same by
tuning the angle of the crystal.

Consider a ruby laser propagating and generating the second harmonic
in a uniaxial KDP crystal (potassium dihydrogen phosphate). The
indices of refraction are given by $n_o$ and

$$\frac{n_o n_e}{\sqrt{n_o^2\sin^2\theta + n_e^2\cos^2\theta}}$$

where $\theta$ is the angle made with the optic axis. At the frequency of a
ruby laser, KDP has indices $n_o(\omega) = 1.505$ and $n_e(\omega) = 1.465$. At the
frequency of the second harmonic, the indices are $n_o(2\omega) = 1.534$ and
$n_e(2\omega) = 1.487$.

Figure 5.15 (figure label: $n_2 = 1.55$)

Show that phase matching can be achieved if the laser is polarized so
that it experiences only the ordinary index and the second harmonic
light is polarized perpendicular to that. At what angle $\theta$ does this phase
matching occur?

Selected Answers 

R21: (b) 4% (c) 4%. 

R22: (b) 33°, (c) 95%, (d) 79%. 

R23: (a) 4.8°, (b) 74°. 

R24: (a) 100 nm. (b) 0.55. 

R26: (a) 0.074°. 

R27: (b) 1.24.

R28: 51.12°.



Chapter 6 

Polarization of Light 




When the direction of the electric field of light oscillates in a regular, predictable 
fashion, we say that the light is polarized. Polarization describes the direction 
of the oscillating electric field, a distinct concept from dipoles per volume in a 
material ($\mathbf{P}$), which is also called polarization. In this chapter, we develop a formalism for
describing polarized light and the effect of devices that modify polarization. If the 
electric field oscillates in a plane, we say that it is linearly polarized. The electric 
field can also spiral around while a plane wave propagates, and this is called 
elliptical polarization. There is a convenient way for keeping track of polarization 
using a two-dimensional Jones vector. 

Many devices, such as polarizers and wave plates, can affect polarization. Their
effects on a light field can be represented by $2\times 2$ Jones matrices that operate on
the Jones vector representing the light. A Jones matrix can describe, for example, 
a linear polarizer oriented at an arbitrary angle with respect to the coordinate 
system. Likewise, a Jones matrix can describe the manner in which a wave plate 
introduces a relative phase between two components of the electric field. A wave 
plate can be used to convert, for example, linearly polarized light into circularly 
polarized light. 

In this chapter, we will also see how reflection and transmission at a material
interface influence field polarization. The Fresnel coefficients studied in chapters
3 and 4 can be conveniently incorporated into the $2\times 2$ matrix formulation for
handling polarization. As we saw previously, the amount of light reflected from a
surface depends on the type of polarization, s or p. In addition, upon reflection,
s-polarized light can acquire a phase lag or phase advance relative to p-polarized
light. This is especially true at metal surfaces, which have complex indices of
refraction. Ellipsometry, outlined in appendix 6.A, is the science of characterizing
optical properties of materials through an examination of these effects.

Throughout this chapter, we consider light to have well-characterized polarization.
However, most common sources of light (e.g. sunlight or a light bulb)
have an electric-field direction that varies rapidly and randomly. Such sources
are commonly referred to as unpolarized. It is common to have a mixture of
unpolarized and polarized light, called partially polarized light. The Jones vector
formalism used in this chapter is inappropriate for describing the unpolarized
portions of the light. In appendix 6.B we describe a more general formalism for
dealing with light having an arbitrary degree of polarization.

Figure 6.1 Animation showing different polarization states of light.




Figure 6.2 The combination of 
two orthogonally polarized plane 
waves that are out of phase results 
in elliptically polarized light. Here 
we have left circularly polarized 
light created as specified by (6.3). 



6.1 Linear, Circular, and Elliptical Polarization

Consider the plane-wave solution to Maxwell's equations given by

$$\mathbf{E}(\mathbf{r},t) = \mathbf{E}_0 e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}$$ (6.1)

The wave vector $\mathbf{k}$ specifies the direction of propagation. We neglect absorption
so that the refractive index is real and $k = n\omega/c = 2\pi n/\lambda_{\rm vac}$ (see (2.19)-(2.24)). In
an isotropic medium we know that $\mathbf{k}$ and $\mathbf{E}$ are perpendicular, but even after the
direction of $\mathbf{k}$ is specified, we are still free to have $\mathbf{E}$ point anywhere in the two
dimensions perpendicular to $\mathbf{k}$. If we orient our coordinate system with the z-axis
in the direction of $\mathbf{k}$, we can write (6.1) as

$$\mathbf{E}(z,t) = \left(E_x\hat{\mathbf{x}} + E_y\hat{\mathbf{y}}\right)e^{i(kz-\omega t)}$$ (6.2)



As always, only the real part of (6.2) is physically relevant. The complex amplitudes
$E_x$ and $E_y$ keep track of the phase of the oscillating field components. In
general the complex phases of $E_x$ and $E_y$ can differ, so that the wave in one of the
dimensions lags or leads the wave in the other dimension.

The relationship between $E_x$ and $E_y$ describes the polarization of the light.
For example, if $E_y$ is zero, the plane wave is said to be linearly polarized along the
x-dimension. Linearly polarized light can have any orientation in the x-y plane,
and it occurs whenever $E_x$ and $E_y$ have the same complex phase (or a phase
differing by an integer times $\pi$). For our purposes, we will take the x-dimension
to be horizontal and the y-dimension to be vertical unless otherwise noted.

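As an example, suppose $E_y = iE_x$, where $E_x$ is real. The y-component of the
field is then out of phase with the x-component by the factor $i = e^{i\pi/2}$. Taking the
real part of the field (6.2) we get

$$\begin{aligned}
\mathbf{E}(z,t) &= \mathrm{Re}\left[E_x e^{i(kz-\omega t)}\right]\hat{\mathbf{x}} + \mathrm{Re}\left[e^{i\pi/2}E_x e^{i(kz-\omega t)}\right]\hat{\mathbf{y}} \\
&= E_x\cos(kz-\omega t)\,\hat{\mathbf{x}} + E_x\cos(kz-\omega t+\pi/2)\,\hat{\mathbf{y}} \\
&= E_x\left[\cos(kz-\omega t)\,\hat{\mathbf{x}} - \sin(kz-\omega t)\,\hat{\mathbf{y}}\right]
\end{aligned}$$ (left circular) (6.3)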



In this example, the field in the y-dimension lags behind the field in the
x-dimension by a quarter cycle. That is, the behavior seen in the x-dimension
happens in the y-dimension a quarter cycle later. The field never goes to zero
simultaneously in both dimensions. In fact, in this example the strength of the
electric field is constant, and it rotates in a circular pattern in the x-y dimensions.
For this reason, this type of field is called circularly polarized. Figure 6.2 graphically
shows the two linearly polarized pieces in (6.3) adding to make circularly
polarized light.

If we view a circularly polarized light field throughout space at a frozen instant
in time (as in Fig. 6.2), the electric field vector spirals as we move along the
z-dimension. If the sense of the spiral (with time frozen) matches that of a common
wood screw oriented along the z-axis, the polarization is called right handed. (It
makes no difference whether the screw is flipped end for end.) If instead the field
spirals in the opposite sense, then the polarization is called left handed. The field
shown in Fig. 6.2 is an example of left-handed circularly polarized light.

An equivalent way to view the handedness convention is to imagine the light
impinging on a screen as a function of time. The field of a right-handed circularly
polarized wave rotates counterclockwise at the screen, when looking along the $\mathbf{k}$
direction (towards the front side of the screen). The field rotates clockwise for a
left-handed circularly polarized wave.
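As a quick numerical check of (6.3) and of the screen picture just described, the following Python sketch evaluates the real field at a fixed plane and tests which way it turns as time advances. The function names and the unit values chosen for $E_x$, $k$, and $\omega$ are illustrative assumptions, not part of the text.

```python
import math

def left_circular_field(z, t, E_x=1.0, k=1.0, omega=1.0):
    """Real field of (6.3), returned as the pair (x-component, y-component)."""
    phase = k * z - omega * t
    return (E_x * math.cos(phase), -E_x * math.sin(phase))

def rotation_sense(field, z=0.0, t=0.0, dt=1e-3):
    """Return +1 or -1 from the z-component of E(t) x E(t + dt) at fixed z.

    +1 means the field turns from x toward y as time advances. That is
    counterclockwise when +z points at the viewer, and therefore clockwise
    for an observer looking along +z (the k direction).
    """
    ex0, ey0 = field(z, t)
    ex1, ey1 = field(z, t + dt)
    return 1 if (ex0 * ey1 - ey0 * ex1) > 0 else -1

# The field strength stays constant while the direction rotates.
sample_times = [0.0, 0.3, 1.7, 4.2]
magnitudes = [math.hypot(*left_circular_field(0.0, t)) for t in sample_times]
sense = rotation_sense(left_circular_field)
print(magnitudes, sense)
```

Here `sense` comes out as +1, so an observer looking along $\mathbf{k}$ sees the field of (6.3) rotate clockwise, in agreement with the left-handed label.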

Linearly polarized light can become circularly or, in general, elliptically polarized
after reflection from a metal surface if the incident light has both s- and
p-polarized components. A good experimentalist working with light needs to
know this. Reflections from multilayer dielectric mirrors can also exhibit these
phase shifts.

6.2 Jones Vectors for Representing Polarization 

In 1941, R. Clark Jones introduced a two-dimensional matrix algebra that is useful
for keeping track of light polarization and the effects of optical elements that
influence polarization.¹ The algebra deals with light having a definite polarization,
such as plane waves. It does not apply to unpolarized or partially polarized light
(e.g. sunlight). For partially polarized light, a four-dimensional algebra known as
Stokes calculus is used (see Appendix 6.B).

In preparation for introducing Jones vectors, we explicitly write the complex
phases of the field components in (6.2) as

$$\mathbf{E}(z,t) = \left(|E_x|e^{i\phi_x}\hat{\mathbf{x}} + |E_y|e^{i\phi_y}\hat{\mathbf{y}}\right)e^{i(kz-\omega t)}$$ (6.4)

and then factor (6.4) as follows:

$$\mathbf{E}(z,t) = E_{\rm eff}\left(A\hat{\mathbf{x}} + Be^{i\delta}\hat{\mathbf{y}}\right)e^{i(kz-\omega t)}$$ (6.5)

where

$$E_{\rm eff} \equiv \sqrt{|E_x|^2+|E_y|^2}\,e^{i\phi_x}$$ (6.6)

$$A \equiv \frac{|E_x|}{\sqrt{|E_x|^2+|E_y|^2}}$$ (6.7)

$$B \equiv \frac{|E_y|}{\sqrt{|E_x|^2+|E_y|^2}}$$ (6.8)

$$\delta \equiv \phi_y - \phi_x$$ (6.9)

R. Clark Jones (1916-2004, American)
was born in Toledo, Ohio. He was one
of six high school seniors to receive a
Harvard College National Prize Fellowship.
He earned both his undergraduate
(summa cum laude 1938) and Ph.D.
degrees from Harvard (1941). After
working several years at Bell Labs, he
spent most of his professional career
at Polaroid Corporation in Cambridge
MA, until his retirement in 1982. He
is well-known for a series of papers on
polarization published during the period
1941-1956. He also contributed greatly
to the development of infrared detectors.
He was an avid train enthusiast, and
even wrote papers on railway engineering.
See J. Opt. Soc. Am. 63, 519-522
(1972). Also see SPIE oemagazine,
p. 52 (Aug. 2004).

¹ E. Hecht, Optics, 3rd ed., Sect. 8.12.2 (Massachusetts: Addison-Wesley, 1998).



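Linearly polarized along x: $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$

Linearly polarized along y: $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$

Linearly polarized at angle $\alpha$ (measured from the x-axis): $\begin{bmatrix} \cos\alpha \\ \sin\alpha \end{bmatrix}$

Right circularly polarized: $\frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ -i \end{bmatrix}$

Left circularly polarized: $\frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ i \end{bmatrix}$

Table 6.1 Jones Vectors for several common polarization states.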



Please notice that $A$ and $B$ are real non-negative dimensionless numbers that
satisfy $A^2 + B^2 = 1$. If $E_y$ is zero, then $B = 0$ and everything is well-defined. On the
other hand, if $E_x$ happens to be zero, then its phase $e^{i\phi_x}$ is indeterminate. In this
case we let $E_{\rm eff} = |E_y|e^{i\phi_y}$, $B = 1$, and $\delta = 0$.

The overall field strength $E_{\rm eff}$ is often unimportant in a discussion of polarization.
It represents the strength of an effective linearly polarized field that would
correspond to the same intensity as (6.4). Specifically, from (6.5) and (2.62) we
have

$$I = \langle S\rangle_t = \frac{1}{2}nc\epsilon_0\,\mathbf{E}\cdot\mathbf{E}^* = \frac{1}{2}nc\epsilon_0|E_{\rm eff}|^2$$ (6.10)

The phase of $E_{\rm eff}$ represents an overall phase shift that one can trivially adjust by
physically moving the light source (a laser, say) forward or backward by a fraction
of a wavelength.

The portion of (6.5) that is relevant to our discussion of polarization is the
vector $A\hat{\mathbf{x}} + Be^{i\delta}\hat{\mathbf{y}}$, referred to as the Jones vector. This vector contains the essential
information regarding field polarization. Notice that the Jones vector is a kind
of unit vector, in that $(A\hat{\mathbf{x}} + Be^{i\delta}\hat{\mathbf{y}})\cdot(A\hat{\mathbf{x}} + Be^{i\delta}\hat{\mathbf{y}})^* = 1$. (The asterisk represents
the complex conjugate.) When writing a Jones vector we dispense with the $\hat{\mathbf{x}}$ and
$\hat{\mathbf{y}}$ notation and organize the components into a column vector (for later use in
matrix algebra) as follows:

$$\begin{bmatrix} A \\ Be^{i\delta} \end{bmatrix}$$ (6.11)

This vector can describe the polarization state of any plane wave field. Table 6.1
lists some Jones vectors representing various polarization states.
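The factoring in (6.5)-(6.9) can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name `jones_from_field` is hypothetical, and reducing $\delta$ to the interval $[0, 2\pi)$ is a choice made here for convenience.

```python
import cmath
import math

def jones_from_field(E_x, E_y):
    """Split complex amplitudes (E_x, E_y) into (E_eff, A, B, delta) per (6.6)-(6.9)."""
    norm = math.hypot(abs(E_x), abs(E_y))
    if norm == 0.0:
        raise ValueError("a zero field has no polarization state")
    if E_x == 0:
        # The phase of E_x is indeterminate, so follow the text:
        # E_eff = |E_y| exp(i phi_y), B = 1, delta = 0.
        return complex(E_y), 0.0, 1.0, 0.0
    phi_x = cmath.phase(E_x)
    E_eff = norm * cmath.exp(1j * phi_x)
    A = abs(E_x) / norm
    B = abs(E_y) / norm
    # delta is irrelevant when B = 0; report 0 in that case.
    delta = (cmath.phase(E_y) - phi_x) % (2 * math.pi) if E_y != 0 else 0.0
    return E_eff, A, B, delta

# The left-circular example E_y = i E_x from (6.3), with E_x = 2 (arbitrary units).
E_eff, A, B, delta = jones_from_field(2.0, 2.0j)
print(A, B, delta)
```

Multiplying back, `E_eff * A` and `E_eff * B * exp(i delta)` recover the original $E_x$ and $E_y$, which is a convenient sanity check on any such routine.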

6.3 Elliptically Polarized Light 



In general, the Jones vector (6.11) represents a polarization state between linear
and circular. This 'between' state is known as elliptically polarized light. As
the wave travels, the field vector makes a spiral motion. If we observe the field
vector at a point as the field goes by, the field vector traces out an ellipse oriented
perpendicular to the direction of travel (i.e. in the x-y plane). One of the axes of
the ellipse occurs at the angle

$$\alpha = \frac{1}{2}\tan^{-1}\left(\frac{2AB\cos\delta}{A^2-B^2}\right)$$ (6.12)

with respect to the x-axis (see P6.8). This angle sometimes corresponds to the
minor axis and sometimes to the major axis of the ellipse, depending on the exact
values of $A$, $B$, and $\delta$. The other axis of the ellipse (major or minor) then occurs at
$\alpha \pm \pi/2$ (see Fig. 6.3).

We can deduce whether (6.12) corresponds to the major or minor axis of the
ellipse by comparing the strength of the electric field when it spirals through the
direction specified by $\alpha$ and when it spirals through $\alpha \pm \pi/2$. The strength of the
electric field at $\alpha$ is given by (see P6.8)

$$E_\alpha = |E_{\rm eff}|\sqrt{A^2\cos^2\alpha + B^2\sin^2\alpha + AB\cos\delta\sin 2\alpha}$$ ($E_{\max}$ or $E_{\min}$) (6.13)

and the strength of the field when it spirals through the orthogonal direction
($\alpha \pm \pi/2$) is given by

$$E_{\alpha\pm\pi/2} = |E_{\rm eff}|\sqrt{A^2\sin^2\alpha + B^2\cos^2\alpha - AB\cos\delta\sin 2\alpha}$$ ($E_{\max}$ or $E_{\min}$) (6.14)

After computing (6.13) and (6.14), we decide which represents $E_{\min}$ and which
$E_{\max}$ according to

$$E_{\max} \geq E_{\min}$$ (6.15)



We could predict in advance which of (6.13) or (6.14) corresponds to the major 
axis and which corresponds to the minor axis. However, making this prediction is 
as complicated as simply evaluating (6.13) and (6.14) and determining which is 
greater. 

Elliptically polarized light is often characterized by the ellipticity, given by the
ratio of the minor axis to the major axis:

$$e \equiv \frac{E_{\min}}{E_{\max}}$$ (6.16)

The ellipticity $e$ ranges between zero (corresponding to linearly polarized light)
and one (corresponding to circularly polarized light). Finally, the helicity or
handedness of elliptically polarized light is as follows (see P6.2):

$$0 < \delta < \pi \rightarrow \text{left-handed helicity}$$ (6.17)

$$\pi < \delta < 2\pi \rightarrow \text{right-handed helicity}$$ (6.18)
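The recipe of this section, namely computing $\alpha$ from (6.12), evaluating (6.13) and (6.14), and then sorting the two strengths, can be sketched as follows. The function name, the choice $|E_{\rm eff}| = 1$, and the handling of the special case $A^2 = B^2$ (where the argument of the inverse tangent diverges) are assumptions made for this illustration.

```python
import math

def ellipse_parameters(A, B, delta):
    """Return (alpha, E_alpha, E_perp, ellipticity, helicity) with |E_eff| = 1."""
    num = 2.0 * A * B * math.cos(delta)
    den = A**2 - B**2
    if den != 0.0:
        alpha = 0.5 * math.atan(num / den)           # (6.12)
    elif num != 0.0:
        alpha = math.copysign(math.pi / 4, num)      # limit of (6.12) as A^2 -> B^2
    else:
        alpha = 0.0                                  # circular light: any axis works
    cross = A * B * math.cos(delta) * math.sin(2 * alpha)
    # max(0, ...) guards against tiny negative round-off under the square root.
    E_alpha = math.sqrt(max(0.0, A**2 * math.cos(alpha)**2
                            + B**2 * math.sin(alpha)**2 + cross))   # (6.13)
    E_perp = math.sqrt(max(0.0, A**2 * math.sin(alpha)**2
                           + B**2 * math.cos(alpha)**2 - cross))    # (6.14)
    ellipticity = min(E_alpha, E_perp) / max(E_alpha, E_perp)       # (6.15), (6.16)
    d = delta % (2 * math.pi)
    if A == 0 or B == 0 or d == 0.0 or d == math.pi:
        helicity = "linear"
    elif d < math.pi:
        helicity = "left"                            # (6.17)
    else:
        helicity = "right"                           # (6.18)
    return alpha, E_alpha, E_perp, ellipticity, helicity

# A case where alpha turns out to mark the minor axis rather than the major axis.
alpha, E_alpha, E_perp, e, helicity = ellipse_parameters(0.6, 0.8, math.pi / 3)
print(math.degrees(alpha), E_alpha, E_perp, e, helicity)
```

For these sample values `E_alpha` is smaller than `E_perp`, illustrating the remark above that (6.12) may return either axis of the ellipse.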

6.4 Linear Polarizers and Jones Matrices 

In 1928, Edwin Land invented Polaroid at the age of nineteen. He did it by stretch- 
ing a polymer sheet and infusing it with iodine. The stretching caused the polymer 
chains to align along a common direction, whereupon the sheet was cemented 
to a substrate. The infusion of iodine caused the individual chains to become 
conductive, like microscopic wires. 

When light impinges upon a Polaroid sheet, the component of electric field
that is parallel to the polymer chains causes a current $\mathbf{J}_{\rm free}$ to oscillate in that
dimension. The resistance to the current quickly dissipates the energy (i.e. the
refractive index is complex) and the light is absorbed. The thickness of the Polaroid
sheet is chosen sufficiently large to ensure that virtually none of the light with
electric field component oscillating along the chains makes it through the device.

The component of electric field that is orthogonal to the polymer chains
encounters electrons that are essentially bound to the narrow width of individual
polymer molecules. For this polarization component, the wave passes through
the material much like it does through typical dielectrics such as glass (i.e. the
refractive index is real). Today, there is a wide variety of technologies for making
polarizers, many very different from Polaroid.

Figure 6.3 The electric field of elliptically polarized light traces an
ellipse in the plane perpendicular to its propagation direction. The
two plots are for different values of $A$, $B$, and $\delta$. The angle $\alpha$ can
describe the major axis (top) or the minor axis (bottom), depending
on the values of these parameters.

Figure 6.4 Light transmitting through a Polaroid sheet. The
conducting polymer chains run vertically in this drawing, and
light polarized along the chains is absorbed. Light polarized
perpendicular to the polymer chains passes through the polarizer.

A polarizer can be represented as a $2\times 2$ matrix that operates on Jones vectors.²
The function of a polarizer is to pass only the component of electric field that
is oriented along the polarizer transmission axis. Thus, if a polarizer is oriented
with its transmission axis along the x-dimension, then only the x-component
of polarization transmits; the y-component is killed. If the polarizer is oriented
with its transmission axis along the y-dimension, then only the y-component of
the field transmits, and the x-component is killed. These two scenarios can be
represented with the following Jones matrices:

$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$$ (polarizer with transmission along x-axis) (6.19)

$$\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$ (polarizer with transmission along y-axis) (6.20)

These matrices operate on any Jones vector representing the polarization of
incident light. The result gives the Jones vector for the light exiting the polarizer.



Example 6.1

Use the Jones matrix (6.19) to calculate the effect of a horizontal polarizer on
light that is initially horizontally polarized, vertically polarized, and arbitrarily
polarized.

Solution: First we consider a horizontally polarized plane wave traversing a polarizer
with its transmission axis oriented also horizontally (x-dimension):

$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$ (horizontal polarizer on horizontally polarized field)

As expected, the polarization state is unaffected by the polarizer. (We have ignored
possible attenuation from surface reflections.)

Now consider vertically polarized light traversing the same horizontal polarizer. In
this case, we have:

$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$ (horizontal polarizer on vertical linear polarization)

As expected, the polarizer extinguishes the light.

Finally, when a horizontally oriented polarizer operates on light with an arbitrary
Jones vector (6.11), we have

$$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} A \\ Be^{i\delta} \end{bmatrix} = \begin{bmatrix} A \\ 0 \end{bmatrix}$$ (horizontal polarizer on arbitrary polarization)

Only the horizontal component of polarization is transmitted through the polarizer.

² E. Hecht, Optics, 3rd ed., Sect. 8.12.3 (Massachusetts: Addison-Wesley, 1998).

While you might readily agree that the matrices given in (6.19) and (6.20)
can be used to get the right result for light traversing a horizontal or a vertical
polarizer, you probably aren't very impressed yet. In the next few sections,
we will derive Jones matrices for a number of optical elements that can modify
polarization: polarizers at arbitrary angle, wave plates at arbitrary angle, and
reflection or transmission at an interface. Table 6.2 shows Jones matrices for
each of these devices. Before deriving these specific Jones matrices, however, we
take a moment to appreciate why the Jones matrix formulation is useful.

The real power of the formalism becomes clear as we consider situations
where light encounters multiple polarization elements in sequence. In these situations,
we use a product of Jones matrices to represent the effect of the compound
system. We can represent this situation by

$$\begin{bmatrix} A' \\ B' \end{bmatrix} = J_{\rm system}\begin{bmatrix} A \\ Be^{i\delta} \end{bmatrix}$$ (6.21)

where the unprimed Jones vector represents light going into the system and the
primed Jones vector represents light emerging from the system. In general, $A'$
and $B'$ will turn out to be complex. However, if desired they can be changed into
the usual form by writing

$$\begin{bmatrix} A' \\ B' \end{bmatrix} = e^{i\phi_{A'}}\begin{bmatrix} |A'| \\ |B'|e^{i\delta'} \end{bmatrix}$$

where $\phi_{A'}$ is an unimportant overall phase, and $\delta'$ is the phase difference between
$B'$ and $A'$.

The matrix $J_{\rm system}$ is a Jones matrix formed by the series of polarization devices.
If there are $N$ devices in the system, the compound matrix is calculated as

$$J_{\rm system} \equiv J_N J_{N-1}\cdots J_2 J_1$$ (6.22)

where $J_n$ is the matrix for the $n$th polarizing optical element encountered in the
system. Notice that the matrices operate on the Jones vector in the order that
the light encounters the devices. Therefore, the matrix for the first device ($J_1$) is
written on the right, and so on until the last device encountered, which is written
on the left, farthest from the Jones vector.

When part of the light is absorbed by passing through one or more polarizers
in a system, the Jones vector of the exiting light does not necessarily remain
normalized to magnitude one (see Example 6.1). Since the components of a Jones
vector represent the electric field, we find the factor by which the intensity of the
light decreases by dotting the vector with its complex conjugate. In accordance
with (6.10), the intensity of the exiting light is

$$\begin{aligned}
I' &= \frac{1}{2}nc\epsilon_0|E_{\rm eff}|^2\left(A'\hat{\mathbf{x}} + B'e^{i\delta'}\hat{\mathbf{y}}\right)\cdot\left(A'\hat{\mathbf{x}} + B'e^{i\delta'}\hat{\mathbf{y}}\right)^* \\
&= \frac{1}{2}nc\epsilon_0|E_{\rm eff}|^2\left(|A'|^2 + |B'|^2\right)
\end{aligned}$$ (6.23)



Linear polarizer: $\begin{bmatrix} \cos^2\theta & \sin\theta\cos\theta \\ \sin\theta\cos\theta & \sin^2\theta \end{bmatrix}$

Half wave plate: $\begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}$

Quarter wave plate: $\begin{bmatrix} \cos^2\theta + i\sin^2\theta & (1-i)\sin\theta\cos\theta \\ (1-i)\sin\theta\cos\theta & \sin^2\theta + i\cos^2\theta \end{bmatrix}$

Right circular polarizer: $\frac{1}{2}\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}$

Left circular polarizer: $\frac{1}{2}\begin{bmatrix} 1 & -i \\ i & 1 \end{bmatrix}$

Reflection from an interface: $\begin{bmatrix} -r_p & 0 \\ 0 & r_s \end{bmatrix}$

Transmission through an interface: $\begin{bmatrix} t_p & 0 \\ 0 & t_s \end{bmatrix}$

Table 6.2 Common Jones Matrices. The angle $\theta$ is measured with
respect to the x-axis and specifies the transmission axis of a linear
polarizer or the fast axis of a wave plate.

Notice that the intensity is attenuated by the factor $|A'|^2 + |B'|^2$ after propagating
through the system. Recall that $E_{\rm eff}$ represents the effective strength of the field
before it enters the polarizer (or other device), so that the initial Jones vector is
normalized to one (see (6.10)). As a reminder, we normally remove an overall
phase factor from the Jones vector so that $A'$ is real and non-negative, and we
choose $\delta'$ so that $B'$ is real and non-negative. However, if we don't bother doing
this, the absolute value signs on $A'$ and $B'$ in (6.23) ensure that we get the correct
value for intensity.
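The ordering rule in (6.22) and the attenuation factor in (6.23) are easy to try out numerically. The sketch below is only illustrative: the helper names are hypothetical, and the three devices are the polarizers of (6.19), (6.20), and the linear-polarizer entry of Table 6.2 evaluated at 45°.

```python
def matmul(M, N):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def apply(M, v):
    """Apply a 2x2 Jones matrix to a Jones vector (a, b)."""
    return tuple(M[i][0] * v[0] + M[i][1] * v[1] for i in range(2))

def system_matrix(devices):
    """J_system = J_N ... J_2 J_1, with devices listed in the order light meets them."""
    J = ((1, 0), (0, 1))
    for device in devices:
        J = matmul(device, J)   # each later device multiplies from the left
    return J

def intensity_factor(v):
    """Attenuation factor |A'|^2 + |B'|^2 appearing in (6.23)."""
    return abs(v[0])**2 + abs(v[1])**2

POL_X = ((1, 0), (0, 0))            # (6.19)
POL_Y = ((0, 0), (0, 1))            # (6.20)
POL_45 = ((0.5, 0.5), (0.5, 0.5))   # linear polarizer of Table 6.2 at 45 degrees

x_polarized = (1, 0)
forward = apply(system_matrix([POL_X, POL_45, POL_Y]), x_polarized)
reverse = apply(system_matrix([POL_Y, POL_45, POL_X]), x_polarized)
print(intensity_factor(forward), intensity_factor(reverse))
```

With the devices in the first order, one quarter of the intensity survives. Reversing the order extinguishes the light entirely, which is why the order of the matrices in (6.22) matters.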



6.5 Jones Matrix for Polarizers at Arbitrary Angles

Figure 6.5 Light transmitting through a polarizer oriented with
transmission axis at angle $\theta$ from x-axis.

Figure 6.6 Electric field components written in the $\hat{\mathbf{e}}_1$-$\hat{\mathbf{e}}_2$ basis.

In this section we develop a Jones matrix for describing an ideal polarizer with
its transmission axis at an arbitrary angle $\theta$ from the x-axis. We will do this in a
general context so that we can take advantage of the present work when discussing
wave plates in the next section. To help keep things on a more conceptual level,
we revert back to using electric field components directly. We will make the
connection with Jones calculus at the end.

The polarizer acts on a plane wave with arbitrary polarization. The electric
field of our plane wave may be written as

$$\mathbf{E}(z,t) = \left(E_x\hat{\mathbf{x}} + E_y\hat{\mathbf{y}}\right)e^{i(kz-\omega t)}$$ (6.24)



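Let the transmission axis of the polarizer be specified by the unit vector $\hat{\mathbf{e}}_1$
and the absorption axis of the polarizer be specified by $\hat{\mathbf{e}}_2$ (orthogonal to the
transmission axis). The vector $\hat{\mathbf{e}}_1$ is oriented at an angle $\theta$ from the x-axis, as
shown in Fig. 6.6. We need to write the electric field components in terms of the
new basis specified by $\hat{\mathbf{e}}_1$ and $\hat{\mathbf{e}}_2$. By inspection of the geometry, the x-y unit
vectors are connected to the new coordinate system via:

$$\begin{aligned}
\hat{\mathbf{x}} &= \cos\theta\,\hat{\mathbf{e}}_1 - \sin\theta\,\hat{\mathbf{e}}_2 \\
\hat{\mathbf{y}} &= \sin\theta\,\hat{\mathbf{e}}_1 + \cos\theta\,\hat{\mathbf{e}}_2
\end{aligned}$$ (6.25)

Substitution of (6.25) into (6.24) yields for the electric field

$$\mathbf{E}(z,t) = \left(E_1\hat{\mathbf{e}}_1 + E_2\hat{\mathbf{e}}_2\right)e^{i(kz-\omega t)}$$ (6.26)

where

$$\begin{aligned}
E_1 &\equiv E_x\cos\theta + E_y\sin\theta \\
E_2 &\equiv -E_x\sin\theta + E_y\cos\theta
\end{aligned}$$ (6.27)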

Now we introduce the effect of the polarizer on the field: $E_1$ is transmitted
unaffected, while $E_2$ is extinguished. To account for the effect of the device, we
multiply $E_2$ by a parameter $\xi$. In the case of the polarizer, $\xi$ is zero, but when we
consider wave plates we will use other values for $\xi$. After traversing the polarizer,
the field becomes

$$\mathbf{E}_{\rm after}(z,t) = \left(E_1\hat{\mathbf{e}}_1 + \xi E_2\hat{\mathbf{e}}_2\right)e^{i(kz-\omega t)}$$ (6.28)

We now have the field after the polarizer, but it would be nice to rewrite it in terms
of the original x-y basis. By inverting (6.25), or again by inspection of Fig. 6.6, we
see that

$$\begin{aligned}
\hat{\mathbf{e}}_1 &= \cos\theta\,\hat{\mathbf{x}} + \sin\theta\,\hat{\mathbf{y}} \\
\hat{\mathbf{e}}_2 &= -\sin\theta\,\hat{\mathbf{x}} + \cos\theta\,\hat{\mathbf{y}}
\end{aligned}$$ (6.29)

Substitution of these relationships into (6.28) together with the definitions (6.27)
for $E_1$ and $E_2$ yields

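$$\begin{aligned}
\mathbf{E}_{\rm after}(z,t) &= \left[\left(E_x\cos\theta + E_y\sin\theta\right)\left(\cos\theta\,\hat{\mathbf{x}} + \sin\theta\,\hat{\mathbf{y}}\right) + \xi\left(-E_x\sin\theta + E_y\cos\theta\right)\left(-\sin\theta\,\hat{\mathbf{x}} + \cos\theta\,\hat{\mathbf{y}}\right)\right]e^{i(kz-\omega t)} \\
&= \left[E_x\left(\cos^2\theta + \xi\sin^2\theta\right) + E_y\left(\sin\theta\cos\theta - \xi\sin\theta\cos\theta\right)\right]\hat{\mathbf{x}}\,e^{i(kz-\omega t)} \\
&\quad + \left[E_x\left(\sin\theta\cos\theta - \xi\sin\theta\cos\theta\right) + E_y\left(\sin^2\theta + \xi\cos^2\theta\right)\right]\hat{\mathbf{y}}\,e^{i(kz-\omega t)}
\end{aligned}$$ (6.30)

Notice that if $\xi = 1$ (i.e. no polarizer), then we get back exactly what we started
with (i.e. (6.30) reduces to (6.24)).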

To get to the Jones matrix for the polarizer, we note that (6.30) is a linear mixture
of $E_x$ and $E_y$ which can be represented with matrix algebra. If we represent
the electric field as a two-dimensional column vector with its x component in the
top and its y component in the bottom (like a Jones vector), then we can rewrite
(6.30) as

$$\mathbf{E}_{\rm after}(z,t) = \begin{bmatrix} \cos^2\theta + \xi\sin^2\theta & \sin\theta\cos\theta - \xi\sin\theta\cos\theta \\ \sin\theta\cos\theta - \xi\sin\theta\cos\theta & \sin^2\theta + \xi\cos^2\theta \end{bmatrix}\begin{bmatrix} E_x \\ E_y \end{bmatrix}e^{i(kz-\omega t)}$$ (6.31)



The matrix here is a properly normalized Jones matrix, even though we did not
bother factoring out $E_{\rm eff}$ to make a properly normalized Jones vector, as specified
in (6.5). We can now write down the Jones matrix for a polarizer by inserting $\xi = 0$
into the matrix:

$$\begin{bmatrix} \cos^2\theta & \sin\theta\cos\theta \\ \sin\theta\cos\theta & \sin^2\theta \end{bmatrix}$$ (polarizer with transmission axis at angle $\theta$) (6.32)

Notice that when $\theta = 0$ this matrix reduces to that of a horizontal polarizer (6.19),
and when $\theta = \pi/2$, it reduces to that of a vertical polarizer (6.20).
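The general matrix in (6.31) is simple to code, and doing so makes the limiting cases noted above easy to confirm. In this sketch the function names are hypothetical and the angles are in radians.

```python
import math

def device_matrix(theta, xi):
    """Matrix of (6.31): the e_1 component passes, the e_2 component is multiplied by xi."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c * c + xi * s * s, s * c - xi * s * c),
            (s * c - xi * s * c, s * s + xi * c * c))

def polarizer(theta):
    """Ideal polarizer (6.32), i.e. (6.31) with xi = 0."""
    return device_matrix(theta, 0.0)

def transmitted_fraction(theta):
    """Intensity fraction passed when x-polarized light (1, 0) meets a polarizer at theta."""
    J = polarizer(theta)
    out = (J[0][0], J[1][0])          # J acting on the column vector (1, 0)
    return abs(out[0])**2 + abs(out[1])**2

horizontal = polarizer(0.0)             # should reproduce (6.19)
vertical = polarizer(math.pi / 2)       # should reproduce (6.20) up to round-off
no_device = device_matrix(0.7, 1.0)     # xi = 1 should give the identity matrix
fraction_60deg = transmitted_fraction(math.pi / 3)
print(horizontal, vertical, no_device, fraction_60deg)
```

The transmitted fraction works out to $\cos^2\theta$ (Malus's law), so a polarizer at 60° to the incident linear polarization passes one quarter of the intensity.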



6.6 Jones Matrices for Wave Plates 

We next consider wave plates (or retarders), which are usually made from birefringent
crystals. The index of refraction in the crystal depends on the orientation of
the electric field polarization. A wave plate has the appearance of a thin window
through which the light passes. The crystal is cut such that the wave plate has
a fast and a slow axis, which are 90° apart in the plane of the window. If the
light is polarized along the fast axis, it experiences index $n_{\rm fast}$. The orthogonal
polarization component experiences higher index $n_{\rm slow}$.

When a plane wave passes through a wave plate, the component of the electric
field oriented along the fast axis travels faster than its orthogonal counterpart,
which introduces a relative phase between the two polarization components.
As light passes through a wave plate of thickness $d$, the phase difference that
accumulates between the fast and the slow polarization components is

$$k_{\rm slow}d - k_{\rm fast}d = \frac{2\pi d}{\lambda_{\rm vac}}\left(n_{\rm slow} - n_{\rm fast}\right)$$ (6.33)

By adjusting the thickness of the wave plate, one can introduce any desired phase
difference.

Figure 6.7 Wave plate interacting with a plane wave. The transmitted
polarization components have altered relative phase.

The most common types of wave plates are the quarter-wave plate and the
half-wave plate. The quarter-wave plate introduces a phase difference of

$$k_{\rm slow}d - k_{\rm fast}d = \pi/2 + 2\pi m$$ (quarter-wave plate) (6.34)

between the two polarization components, where $m$ is an integer. This means
that the polarization component along the slow axis is delayed spatially by a
quarter wavelength (or five quarters, etc.).

The half-wave plate introduces a phase difference of

$$k_{\rm slow}d - k_{\rm fast}d = \pi + 2\pi m$$ (half-wave plate) (6.35)

where $m$ is an integer. This means that the polarization component along the
slow axis is delayed spatially by a half wavelength (or three halves, etc.). When
$m = 0$ in either (6.34) or (6.35), the wave plate is said to be zero order.
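To get a feel for the thicknesses involved, one can solve (6.33) for $d$ with the retardance set by (6.34) or (6.35). The numbers in the sketch below (a 633 nm wavelength and indices differing by 0.009) are assumed purely for illustration and are not taken from the text; the function name is likewise hypothetical.

```python
import math

def wave_plate_thickness(retardance, wavelength_vac, n_slow, n_fast, order=0):
    """Thickness d for which (6.33) equals retardance + 2*pi*order."""
    total_phase = retardance + 2 * math.pi * order
    return total_phase * wavelength_vac / (2 * math.pi * (n_slow - n_fast))

# Illustrative (assumed) values, not data from the text.
wavelength = 633e-9            # meters
n_fast, n_slow = 1.543, 1.552

d_quarter = wave_plate_thickness(math.pi / 2, wavelength, n_slow, n_fast)
d_half = wave_plate_thickness(math.pi, wavelength, n_slow, n_fast)
d_quarter_m1 = wave_plate_thickness(math.pi / 2, wavelength, n_slow, n_fast, order=1)
print(d_quarter, d_half, d_quarter_m1)
```

For these assumed values the zero-order quarter-wave plate is only a few tens of microns thick. The $m = 1$ plate is five times thicker, matching the "five quarters" remark above.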

The derivation of the Jones matrix for the two wave plates is essentially the
same as the derivation for the polarizer in the previous section. Let $\hat{\mathbf{e}}_1$ correspond
to the fast axis, and let $\hat{\mathbf{e}}_2$ correspond to the slow axis, as illustrated in Fig. 6.7. We
proceed as before. However, instead of setting $\xi$ equal to zero in (6.31), we must
choose values for $\xi$ appropriate for each wave plate. Since nothing is absorbed,
$\xi$ should have a magnitude equal to one. The important feature is the phase of
$\xi$. As seen in (6.33), the field component along the slow axis accumulates excess
phase relative to the component along the fast axis, and we let $\xi$ account for this.
In the case of the quarter-wave plate, the appropriate factor from (6.34) is

$$\xi = e^{i\pi/2} = i$$ (quarter-wave plate) (6.36)

This describes a relative phase delay for the light emerging with polarization along
the slow axis. Substituting (6.36) into (6.30) yields the Jones matrix for a quarter
wave plate:

$$\begin{bmatrix} \cos^2\theta + i\sin^2\theta & \sin\theta\cos\theta - i\sin\theta\cos\theta \\ \sin\theta\cos\theta - i\sin\theta\cos\theta & \sin^2\theta + i\cos^2\theta \end{bmatrix}$$ (quarter-wave plate Jones matrix) (6.37)



For the half-wave plate, the appropriate factor applied to the slow axis is

$$\xi = e^{i\pi} = -1$$ (half-wave plate) (6.38)

and the Jones matrix becomes:

$$\begin{bmatrix} \cos^2\theta - \sin^2\theta & 2\sin\theta\cos\theta \\ 2\sin\theta\cos\theta & \sin^2\theta - \cos^2\theta \end{bmatrix} = \begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}$$ (half-wave plate Jones matrix) (6.39)

Remember that $\theta$ refers to the angle that the fast axis makes with respect to the
x-axis.
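Both wave-plate matrices follow from the single $\xi$-dependent matrix (6.31), which the following Python sketch makes explicit. It also checks numerically that a wave plate absorbs nothing, since $|\xi| = 1$. The function names and the sample input state are assumptions made for this illustration.

```python
import cmath
import math

def device_matrix(theta, xi):
    """Matrix of (6.31) for a device whose e_1 axis (the fast axis) is at angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c * c + xi * s * s, s * c - xi * s * c),
            (s * c - xi * s * c, s * s + xi * c * c))

def quarter_wave_plate(theta):
    return device_matrix(theta, 1j)     # xi = i from (6.36)

def half_wave_plate(theta):
    return device_matrix(theta, -1)     # xi = -1 from (6.38)

def apply(J, v):
    """Apply a 2x2 Jones matrix to a Jones vector (a, b)."""
    return tuple(J[i][0] * v[0] + J[i][1] * v[1] for i in range(2))

def intensity_factor(v):
    return abs(v[0])**2 + abs(v[1])**2

theta = 0.4                                   # fast-axis angle in radians (arbitrary)
light = (0.6, 0.8 * cmath.exp(1.1j))          # an arbitrary normalized Jones vector
H = half_wave_plate(theta)
after_qwp = apply(quarter_wave_plate(theta), light)
after_hwp = apply(H, light)
print(H, intensity_factor(after_qwp), intensity_factor(after_hwp))
```

The half-wave matrix produced this way agrees with the compact form in (6.39), and the intensity factor stays at one for both plates.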

Before moving on, consider the following two examples that illustrate how 
wave plates are often used: 



Example 6.2 

Calculate the Jones matrix for a quarter- wave plate at 9 ■ 
effect on horizontally polarized light. 



45°, and determine its 



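Solution: At $\theta = 45^\circ$, the Jones matrix for the quarter-wave plate (6.37) reduces to

\[
\frac{e^{i\pi/4}}{\sqrt{2}}
\begin{bmatrix} 1 & -i \\ -i & 1 \end{bmatrix}
\qquad \text{(quarter-wave plate, fast axis at } \theta = 45^\circ\text{)} \tag{6.40}
\]

The overall phase factor $e^{i\pi/4}$ in front is not important since it merely accompanies
the overall phase of the beam, which can be adjusted arbitrarily by moving the
light source forwards or backwards through a fraction of a wavelength.

Now we calculate the effect of the quarter-wave plate (oriented at $\theta = 45^\circ$) operating
on horizontally polarized light:

\[
\frac{1}{\sqrt{2}}
\begin{bmatrix} 1 & -i \\ -i & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ 0 \end{bmatrix}
=
\frac{1}{\sqrt{2}}
\begin{bmatrix} 1 \\ -i \end{bmatrix}
\tag{6.41}
\]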



The previous example shows that a quarter-wave plate (properly oriented) can
turn linearly polarized light into right-circularly polarized light (see Table 6.1). 
On the other hand, as seen in the next example, a half-wave plate can rotate the 
polarization angle of linearly polarized light by varying degrees while preserving 
the linear polarization. 



Example 6.3 

Calculate the effect of a half-wave plate at an arbitrary $\theta$ on horizontally polarized
light. 




Figure 6.8 Animation showing 
effects of polarizers and wave 
plates on polarized light. 



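Solution: Carrying out the multiplication, we obtain

\[
\begin{bmatrix}
\cos 2\theta & \sin 2\theta \\
\sin 2\theta & -\cos 2\theta
\end{bmatrix}
\begin{bmatrix} 1 \\ 0 \end{bmatrix}
=
\begin{bmatrix} \cos 2\theta \\ \sin 2\theta \end{bmatrix}
\tag{6.42}
\]

The resulting Jones vector describes linearly polarized light at an angle of $\alpha = 2\theta$ from
the x-axis (see Table 6.1).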
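
The two examples above can be reproduced numerically. The following is a minimal sketch in plain Python; the helper names (`apply`, `quarter_wave_plate_45`, `half_wave_plate`) are hypothetical and simply encode (6.40), with its overall phase dropped, and (6.39).

```python
import math

def apply(matrix, vector):
    """Multiply a 2x2 Jones matrix onto a 2-component Jones vector."""
    return [matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
            matrix[1][0] * vector[0] + matrix[1][1] * vector[1]]

def quarter_wave_plate_45():
    """Eq. (6.40) with the overall phase factor dropped."""
    k = 1 / math.sqrt(2)
    return [[k, -1j * k], [-1j * k, k]]

def half_wave_plate(theta):
    """Eq. (6.39) for a fast axis at angle theta (radians)."""
    return [[math.cos(2 * theta), math.sin(2 * theta)],
            [math.sin(2 * theta), -math.cos(2 * theta)]]

horizontal = [1.0, 0.0]

# Example 6.2: the output has E_y / E_x = -i, which the text identifies
# as right-circular polarization.
circular = apply(quarter_wave_plate_45(), horizontal)
print(circular[1] / circular[0])

# Example 6.3: the output stays linear, at angle alpha = 2 * theta.
for theta_deg in (10.0, 22.5, 40.0):
    out = apply(half_wave_plate(math.radians(theta_deg)), horizontal)
    alpha_deg = math.degrees(math.atan2(out[1], out[0]))
    print(theta_deg, alpha_deg)
```

Each printed pair in the loop should show the polarization angle at twice the plate angle.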
6.7 Polarization Effects of Reflection and Transmission

















Figure 6.9 Incident, reflected and
transmitted plane waves, each
propagating along the z-axis of its
own reference frame. (Figure label:
x-axis directed into page.)



When light encounters a material interface, the amount of reflected and trans- 
mitted light depends on the polarization. The Fresnel coefficients (3.18)-(3.21) 
dictate how much of each polarization is reflected and how much is transmitted. 
In addition, the Fresnel coefficients keep track of phases intrinsic in the reflec- 
tion phenomenon. This is true also for reflections from multilayer coatings with 
effective Fresnel coefficients (4.60), (4.61), (4.65) and (4.66). 

To the extent that the s and p components of the field behave differently, 
the overall polarization state is altered. For example, a linearly-polarized field 
upon reflection can become elliptically polarized (see L 6.9). Even when a wave 
reflects at normal incidence so that the s and p components are indistinguishable,
right-circularly polarized light becomes left-circularly polarized. This is the same
effect that causes a right-handed person to appear left-handed when viewed in a 
mirror. 

We can use Jones calculus to keep track of how reflection and transmission 
influences polarization. However, before proceeding, we emphasize that in this 
context we do not strictly adhere to a single coordinate system as we did in 
chapter 3, for example in Fig. 3.1. Instead, we consider each plane wave, whether 
incident, reflected or transmitted, to propagate in the z-direction of its own frame, 
regardless of the relative angles between the incident and reflected wave. This 
loose manner of defining coordinate systems, depicted in Fig. 6.9, has a great 
advantage. The x and y dimensions in each individual frame are aligned parallel
to the respective p and s field components. We will adopt the convention that
p-polarized light in all cases is associated with the x-dimension (horizontal, say). 
The s-polarized component then lies along the y-dimension (vertical). 

We are now in a position to see why there is a handedness inversion upon 
reflection from a mirror. Notice in Fig. 6.9 that for the incident light, the s-component
of the field crossed into the p-component of the field yields a vector
pointing along that beam's propagation direction. However, for the reflected light,
the s-component crossed into the p-component points opposite to that beam's
propagation direction.

The Jones matrix corresponding to reflection from a surface is 







\[
\begin{bmatrix} -r_p & 0 \\ 0 & r_s \end{bmatrix}
\qquad \text{(Jones matrix for reflection)} \tag{6.43}
\]



By convention, we place the minus sign on the coefficient $r_p$ to take care of
handedness inversion. We could put the minus sign on $r_s$ instead; the important
point is that the two polarizations acquire a relative phase differential of $\pi$ when
the propagation direction flips.³

The Fresnel coefficients specify the ratios of the exiting fields to the incident
ones. When (6.43) operates on an arbitrary Jones vector such as (6.11), $-r_p$
multiplies the horizontal component of the field, and $r_s$ multiplies the vertical
component of the field. Especially in the case of reflection from an absorbing
surface such as a metal, the phases of the two polarization components can vary
markedly (see P6.11). Thus, linearly polarized light containing both s- and p-components
in general becomes elliptically polarized when reflected from such a
surface. When light undergoes total internal reflection, again the phases of the s-
and p-components differ markedly, which can cause linearly polarized light to
become elliptically polarized (see P6.12).

³ The minus sign is needed for our specific convention of field directions, as drawn in Fig. 6.9. In
our convention, $r_s$ and $r_p$ are identical at normal incidence.

Transmission through a material interface can also influence the polarization 
of the field, although typically to a lesser degree. However, there is no handedness 
inversion, since the light continues on in a forward sense. The Jones matrix for 
transmission is 











\[
\begin{bmatrix} t_p & 0 \\ 0 & t_s \end{bmatrix}
\qquad \text{(Jones matrix for transmission)} \tag{6.44}
\]



If a beam of light encounters a series of mirrors, the final polarization is 
determined by multiplying the sequence of appropriate Jones matrices (6.43) 
onto the initial polarization. This procedure is straightforward if the normals 
to all of the mirrors lie in a single plane (say parallel to the surface of an optical 
bench). However, if the beam path deviates from this plane (due to vertical 
tilt on the mirrors), then we must reorient our coordinate system before each 
mirror to have a new 'horizontal' (p-polarized dimension) and the new 'vertical' 
(s-polarized dimension). Earlier in this chapter we performed a rotation of a 
coordinate system through an angle $\theta$, described in (6.27), which is also useful
here. The rotation can be accomplished by multiplying the following matrix onto 
the incident Jones vector: 



\[
\begin{bmatrix}
\cos\theta & \sin\theta \\
-\sin\theta & \cos\theta
\end{bmatrix}
\qquad \text{(rotation of coordinates through an angle } \theta\text{)} \tag{6.45}
\]



This is understood as a rotation about the z-axis. The angle of rotation $\theta$ is
chosen such that the rotated x-axis lies in the plane of incidence for the mirror. 
When such a reorientation of coordinates is necessary, the two orthogonal field 
components in the initial coordinate system are stirred together to form the field 
components in the new system. This does not change the intrinsic characteristics 
of the polarization, just its representation. 
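
To make the handedness inversion and the coordinate rotation concrete, here is a small sketch using the reflection matrix (6.43) and the rotation matrix (6.45). The reflection amplitude `r = 0.9` is an assumed illustrative value, and the helper names are our own.

```python
import math

def reflection(r_p, r_s):
    """Jones matrix (6.43); x carries the p component, y the s component."""
    return [[-r_p, 0.0], [0.0, r_s]]

def rotation(theta):
    """Coordinate rotation (6.45) about the z-axis, theta in radians."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]

def apply(matrix, vector):
    """Multiply a 2x2 Jones matrix onto a 2-component Jones vector."""
    return [matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
            matrix[1][0] * vector[0] + matrix[1][1] * vector[1]]

def intensity(vector):
    """Relative intensity |E_x|^2 + |E_y|^2 of a Jones vector."""
    return abs(vector[0]) ** 2 + abs(vector[1]) ** 2

right_circular = [1 / math.sqrt(2), -1j / math.sqrt(2)]

# Normal incidence: r_s = r_p in the text's convention (value assumed here).
r = 0.9
reflected = apply(reflection(r, r), right_circular)
# The ratio E_y / E_x changes from -i to +i, i.e. left-circular light.
print(reflected[1] / reflected[0])

# Re-expressing a vector in rotated coordinates leaves its intensity unchanged.
rotated = apply(rotation(math.radians(30.0)), right_circular)
print(intensity(right_circular), intensity(rotated))
```

For a sequence of mirrors, the same `apply` calls would simply be chained, with a `rotation` inserted wherever the plane of incidence changes.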



Figure 6.10 If the plane of incidence
does not coincide for successive
elements in an optical
system, a rotation matrix must be
applied to rotate the x-axis to the
plane of incidence before computing
the effect of each element. (Figure labels:
original y-axis; rotated x-axis, in the plane of incidence.)



Appendix 6.A Ellipsometry 

Measuring the polarization of light reflected from a surface can yield information 
regarding the optical constants of that surface (i.e. n and k). As done in L 6.9, it is 
possible to characterize the polarization of a beam of light using a quarter-wave
plate and a polarizer. However, we often want to know n and k at a range of
frequencies, and this would require a different quarter-wave plate thickness d
for each wavelength used (see (6.34)). Therefore, many commercial ellipsometers
do not try to extract the helicity of the light, but only the ellipticity. In this case
only polarizers are needed, which can be made to work over a wide range of 
wavelengths. If, in addition, a variety of incident angles are measured, it is possible 
to extract detailed information about the optical constants n and k and the 
thicknesses of possibly many layers of materials influencing the reflection. 

Commercial ellipsometers⁴ typically employ two polarizers, one before and
one after the sample, where s- and p-polarized reflections take place. The first
polarizer ensures that linearly polarized light arrives at the test surface (polarized
at angle $\alpha$ to give both s- and p-components). The Jones matrix for the test surface
reflection is given by (6.43), and the Jones matrix for the analyzing polarizer
oriented at angle $\theta$ is given by (6.32). The Jones vector for the light arriving at the
detector is then



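\[
\begin{bmatrix}
\cos^2\theta & \sin\theta\cos\theta \\
\sin\theta\cos\theta & \sin^2\theta
\end{bmatrix}
\begin{bmatrix} -r_p & 0 \\ 0 & r_s \end{bmatrix}
\begin{bmatrix} \cos\alpha \\ \sin\alpha \end{bmatrix}
=
\begin{bmatrix}
-r_p\cos\alpha\cos^2\theta + r_s\sin\alpha\sin\theta\cos\theta \\
-r_p\cos\alpha\sin\theta\cos\theta + r_s\sin\alpha\sin^2\theta
\end{bmatrix}
\tag{6.46}
\]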

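and the intensity arriving at the detector is

\[
\begin{aligned}
I &\propto \left|-r_p\cos\alpha\cos^2\theta + r_s\sin\alpha\cos\theta\sin\theta\right|^2
 + \left|-r_p\cos\alpha\cos\theta\sin\theta + r_s\sin\alpha\sin^2\theta\right|^2 \\
&= |r_p|^2\cos^2\alpha\cos^2\theta + |r_s|^2\sin^2\alpha\sin^2\theta
 - \frac{r_p^* r_s + r_p r_s^*}{4}\,\sin 2\alpha\,\sin 2\theta
\end{aligned}
\tag{6.47}
\]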

For ellipsometry measurements, it is customary to express the ratio of Fresnel 
coefficients as 

\[ r_p / r_s = \tan\Psi\, e^{i\Delta} \tag{6.48} \]

In this case, the intensity may be shown to be proportional to (see problem P6.13)

\[ I \propto 1 - \eta\sin 2\theta + \xi\cos 2\theta \tag{6.49} \]

where

\[
\eta = \frac{2\tan\Psi\cos\Delta\tan\alpha}{\tan^2\Psi + \tan^2\alpha}
\quad\text{and}\quad
\xi = \frac{\tan^2\Psi - \tan^2\alpha}{\tan^2\Psi + \tan^2\alpha}
\tag{6.50}
\]

In commercial ellipsometers, the angle $\theta$ of the analyzing polarizer often rotates at
a high speed, and the time dependence of the light reaching a detector is analyzed.
From this type of measurement, the coefficients $\eta$ and $\xi$ can be extracted with
high precision. Then equations (6.50) can be inverted (see problem P6.13) to
reveal

\[
\tan\Psi = \sqrt{\frac{1+\xi}{1-\xi}}\;|\tan\alpha|
\quad\text{and}\quad
\cos\Delta = \frac{\eta}{\sqrt{1-\xi^2}}\;\mathrm{sign}(\alpha)
\tag{6.51}
\]



From a series of these types of measurements, it is possible to extract the values
of n and k for materials from the expressions for $r_s$ and $r_p$ (with the aid of a
computer!). A more extensive series of such measurements is needed in the case
of multilayers with varying thicknesses.
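
The measurement chain described above can be mimicked in a few lines. This is a hedged sketch, not a model of any real instrument: the Fresnel coefficients `r_p` and `r_s` are hypothetical values, the rotating analyzer is replaced by uniform sampling over half a turn, and $\eta$ and $\xi$ are obtained by projecting the sampled intensity onto $\sin 2\theta$ and $\cos 2\theta$ before applying (6.51).

```python
import cmath
import math

def detector_intensity(r_p, r_s, alpha, theta):
    """Intensity (arbitrary units) behind the analyzer, from (6.46)."""
    amplitude = (-r_p * math.cos(alpha) * math.cos(theta)
                 + r_s * math.sin(alpha) * math.sin(theta))
    return abs(amplitude) ** 2

def extract_eta_xi(r_p, r_s, alpha, samples=36):
    """Recover eta and xi of (6.49) from one half-turn of the analyzer."""
    thetas = [math.pi * k / samples for k in range(samples)]
    values = [detector_intensity(r_p, r_s, alpha, t) for t in thetas]
    mean = sum(values) / samples
    eta = -2 * sum(v * math.sin(2 * t) for v, t in zip(values, thetas)) / (samples * mean)
    xi = 2 * sum(v * math.cos(2 * t) for v, t in zip(values, thetas)) / (samples * mean)
    return eta, xi

def invert(eta, xi, alpha):
    """Apply (6.51) to obtain tan(Psi) and cos(Delta)."""
    tan_psi = math.sqrt((1 + xi) / (1 - xi)) * abs(math.tan(alpha))
    cos_delta = eta / math.sqrt(1 - xi ** 2) * math.copysign(1.0, alpha)
    return tan_psi, cos_delta

# Hypothetical Fresnel coefficients, chosen only for illustration.
r_p = 0.35 * cmath.exp(1j * 0.4)
r_s = 0.80 * cmath.exp(1j * 1.9)
alpha = math.radians(30.0)

eta, xi = extract_eta_xi(r_p, r_s, alpha)
tan_psi, cos_delta = invert(eta, xi, alpha)
print(tan_psi, abs(r_p / r_s))                      # the two should agree
print(cos_delta, math.cos(cmath.phase(r_p / r_s)))  # the two should agree
```

Note that only $\cos\Delta$ is recovered, so the sign of $\Delta$ stays undetermined, in line with the remark that polarizer-only ellipsometers do not extract the helicity.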



⁴ See Spectroscopic Ellipsometry Tutorial at J. A. Woollam Co.







Appendix 6.B Partially Polarized Light 

We outline here an approach for dealing with partially polarized light, which is a 
mixture of polarized and unpolarized light. Most natural light such as sunshine is 
unpolarized. The transverse electric field direction in natural light varies rapidly 
(and quasi randomly). Such variations imply the superposition of multiple frequencies
as opposed to the single frequency assumed in the formulation of Jones
calculus earlier in this chapter. Unpolarized light can become partially polarized
when it, for example, reflects from a surface at oblique incidence, since s and p
components of the polarization might reflect with differing strength.

Stokes vectors are used to keep track of the partial polarization (and attenuation)
of a light beam as the light progresses through an optical system.⁵ In
contrast, Jones vectors are designed for pure polarization states. We can consider
any light beam as an intensity sum of completely unpolarized light and perfectly
polarized light:

\[ I = I_{\mathrm{pol}} + I_{\mathrm{un}} \tag{6.52} \]

It is assumed that both types of light propagate in the same direction. 

The main characteristic of unpolarized light is that it cannot be extinguished 
by a single polarizer (even in combination with a wave plate). Moreover, the
transmission of unpolarized light through an ideal polarizer is always 50%. On the 
other hand, polarized light (be it linearly, circularly, or elliptically polarized) can 
always be represented by a Jones vector, and it is always possible to extinguish 
polarized light with a wave plate and a single polarizer. 

We may introduce the degree of polarization as the fraction of the intensity 
that is in a definite polarization state: 



\[ \Pi_{\mathrm{pol}} \equiv \frac{I_{\mathrm{pol}}}{I_{\mathrm{pol}} + I_{\mathrm{un}}} \tag{6.53} \]



Sir George Gabriel Stokes (1819-1903, 
Irish) was born in Skreen, Ireland. He 
entered Cambridge University at age 18 
and graduated four years later with the 
distinction of senior wrangler. In 1849, 
he became a professor of mathematics 
at Cambridge where he later worked 
with James Clerk Maxwell and Lord 
Kelvin to form the Cambridge School 
of Mathematical Physics. Stokes was a
powerful mathematician as well as a good
experimentalist, often testing his theoretical
solutions in the laboratory. In
addition to his contributions to optics,
Stokes made important contributions to
fluid dynamics (e.g. the Navier-Stokes
equations) and to mathematical physics;
Stokes' theorem is employed in several
places in this book. (Wikipedia)



The degree of polarization takes on values between zero and one. Thus, if the light
is completely unpolarized (such that $I_{\mathrm{pol}} = 0$), the degree of polarization is zero,
and if the beam is fully polarized (such that $I_{\mathrm{un}} = 0$), the degree of polarization is
one.

A Stokes vector, which characterizes a partially polarized beam, is written as

\[ \begin{bmatrix} S_0 \\ S_1 \\ S_2 \\ S_3 \end{bmatrix} \]

The parameter

\[ S_0 \equiv \frac{I}{I_{\mathrm{in}}} \tag{6.54} \]

is a comparison of the beam's intensity (or power) to a benchmark or 'input'
intensity, $I_{\mathrm{in}}$, measured before the beam enters the optical system under consideration.
$I$ represents the intensity at the point of investigation, where one wishes
to characterize the beam. Thus, the value $S_0 = 1$ represents the input intensity,
and $S_0$ can drop to values less than one, to account for attenuation of light by
polarizers in the system. ($S_0$ could increase in the atypical case of amplification.)

⁵ E. Hecht, Optics, 3rd ed., Sect. 8.12.1 (Massachusetts: Addison-Wesley, 1998).

The next parameter, $S_1$, describes how much the light looks either horizontally
or vertically polarized, and it is defined as

\[ S_1 \equiv \frac{2 I_{\mathrm{hor}}}{I_{\mathrm{in}}} - S_0 \tag{6.55} \]

Here, $I_{\mathrm{hor}}$ represents the amount of light detected if an ideal linear polarizer is
placed with its axis aligned horizontally directly in front of the detector (inserted
where the light is characterized). $S_1$ ranges between negative one and one, taking
on these extremes when the light is linearly polarized vertically or horizontally,
respectively. If the light has been attenuated, it may still be perfectly horizontally
polarized even if $S_1$ has a magnitude less than one. (Alternatively, you might wish
to examine $S_1/S_0$, which is guaranteed to range between negative one and one.)

The parameter $S_2$ describes how much the light looks linearly polarized along
the diagonals. It is given by

\[ S_2 \equiv \frac{2 I_{45^\circ}}{I_{\mathrm{in}}} - S_0 \tag{6.56} \]

Similar to the previous case, $I_{45^\circ}$ represents the amount of light detected if an
ideal linear polarizer is placed with its axis at 45° directly in front of the detector
(inserted where the light is characterized). As before, $S_2$ ranges between negative
one and one, taking on these extremes when the light is linearly polarized at 135°
or 45°, respectively.

Finally, $S_3$ characterizes the extent to which the beam is either right or left
circularly polarized:

\[ S_3 \equiv \frac{2 I_{\mathrm{r\text{-}cir}}}{I_{\mathrm{in}}} - S_0 \tag{6.57} \]

Here, $I_{\mathrm{r\text{-}cir}}$ represents the amount of light detected if an ideal right-circular polarizer
is placed directly in front of the detector. A right-circular polarizer is
one that passes right-handed polarized light, but blocks left-handed polarized
light. One way to construct such a polarizer is a quarter-wave plate, followed
by a linear polarizer with the transmission axis aligned 45° from the wave-plate
fast axis, followed by another quarter-wave plate at -45° from the polarizer (see
P6.14).⁶ Again, this parameter ranges between negative one and one, taking on
these extremes for left and right circular polarization, respectively.

Importantly, if any one of the parameters $S_1$, $S_2$, or $S_3$ takes on an extreme value
(i.e. a magnitude equal to $S_0$), the other two parameters necessarily equal zero. As
an example, if a beam is linearly horizontally polarized with $I = I_{\mathrm{in}}$, then we have
$I_{\mathrm{hor}} = I_{\mathrm{in}}$, $I_{45^\circ} = I_{\mathrm{in}}/2$, and $I_{\mathrm{r\text{-}cir}} = I_{\mathrm{in}}/2$. This yields $S_0 = 1$, $S_1 = 1$, $S_2 = 0$, and $S_3 = 0$.
As a second example, suppose that the light has been attenuated to $I = I_{\mathrm{in}}/3$ but is
purely left circularly polarized. Then we have $I_{\mathrm{hor}} = I_{\mathrm{in}}/6$, $I_{45^\circ} = I_{\mathrm{in}}/6$, and $I_{\mathrm{r\text{-}cir}} = 0$,
and the Stokes parameters are $S_0 = 1/3$, $S_1 = 0$, $S_2 = 0$, and $S_3 = -1/3$.

Another interesting case is completely unpolarized light, which transmits 50%
through all of the polarizers discussed above. In this case, $I_{\mathrm{hor}} = I_{45^\circ} = I_{\mathrm{r\text{-}cir}} = I/2$
and $S_1 = S_2 = S_3 = 0$.

⁶ The final quarter-wave plate is to put the light back into the original circular state; it is not needed
to measure the Stokes parameter.



Example 6.4 

Find the Stokes parameters for perfectly polarized light, represented by an arbitrary
Jones vector $\begin{bmatrix} A \\ B \end{bmatrix}$ where A and B are complex.⁷ Depending on the values A and B,
the polarization can follow any ellipse.

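Solution: The input intensity of this polarized beam is $I_{\mathrm{in}} = I_{\mathrm{pol}} = |A|^2 + |B|^2$, according
to Eq. (6.23), where we absorb the factor $\frac{1}{2}\epsilon_0 c |E_{\mathrm{eff}}|^2$ into $|A|^2$ and $|B|^2$
for convenience. The Jones vector for the light that passes through a horizontal
polarizer is

\[
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} A \\ B \end{bmatrix}
=
\begin{bmatrix} A \\ 0 \end{bmatrix}
\]

which gives a measured intensity of $I_{\mathrm{hor}} = |A|^2$. Similarly, the Jones vector when
the beam is passed through a polarizer oriented at 45° is

\[
\frac{1}{2}
\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} A \\ B \end{bmatrix}
=
\frac{A + B}{2}
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]

leading to an intensity of

\[ I_{45^\circ} = \frac{|A + B|^2}{2} = \frac{|A|^2 + |B|^2 + A^*B + AB^*}{2} \]

Finally, the Jones vector for light passing through a right-circular polarizer (see
P6.14) is

\[
\frac{1}{2}
\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}
\begin{bmatrix} A \\ B \end{bmatrix}
=
\frac{A + iB}{2}
\begin{bmatrix} 1 \\ -i \end{bmatrix}
\]

giving an intensity of

\[ I_{\mathrm{r\text{-}cir}} = \frac{|A + iB|^2}{2} = \frac{|A|^2 + |B|^2 + i\,(A^*B - AB^*)}{2} \]

Thus, the Stokes parameters become

\[
\begin{aligned}
S_0 &= \frac{|A|^2 + |B|^2}{I_{\mathrm{in}}} = 1 \\
S_1 &= \frac{2|A|^2}{I_{\mathrm{in}}} - \frac{|A|^2 + |B|^2}{I_{\mathrm{in}}} = \frac{|A|^2 - |B|^2}{I_{\mathrm{in}}} \\
S_2 &= \frac{|A|^2 + |B|^2 + A^*B + AB^*}{I_{\mathrm{in}}} - \frac{|A|^2 + |B|^2}{I_{\mathrm{in}}} = \frac{A^*B + AB^*}{I_{\mathrm{in}}} \\
S_3 &= \frac{|A|^2 + |B|^2 + i\,(A^*B - AB^*)}{I_{\mathrm{in}}} - \frac{|A|^2 + |B|^2}{I_{\mathrm{in}}} = i\,\frac{A^*B - AB^*}{I_{\mathrm{in}}}
\end{aligned}
\]

⁷ We will find it easier in this appendix to write $\begin{bmatrix} A \\ B \end{bmatrix}$ instead of $\begin{bmatrix} |A| \\ |B| e^{i\delta} \end{bmatrix}$, where $\delta$ is the phase
difference between B and A.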
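
The result of Example 6.4 translates directly into a small helper. The sketch below is illustrative only; the function name `stokes_from_jones` and the optional `i_in` argument are our own choices, not part of the text.

```python
import math

def stokes_from_jones(a, b, i_in=None):
    """Stokes parameters (S0, S1, S2, S3) for a pure state with Jones vector (a, b).

    i_in is the benchmark intensity; by default it equals |a|^2 + |b|^2,
    as in Example 6.4, so that S0 = 1.
    """
    a, b = complex(a), complex(b)
    i_pol = abs(a) ** 2 + abs(b) ** 2
    if i_in is None:
        i_in = i_pol
    cross = a.conjugate() * b          # A* B; its complex conjugate is A B*
    s0 = i_pol / i_in
    s1 = (abs(a) ** 2 - abs(b) ** 2) / i_in
    s2 = 2 * cross.real / i_in         # (A*B + AB*) / I_in
    s3 = -2 * cross.imag / i_in        # i (A*B - AB*) / I_in
    return s0, s1, s2, s3

# Horizontally polarized light with I = I_in.
print(stokes_from_jones(1, 0))

# Left-circular light attenuated to I = I_in / 3 (the second example in the text).
k = math.sqrt(1 / 6)
print(stokes_from_jones(k, 1j * k, i_in=1.0))
```

The two printed tuples can be compared with the values quoted earlier for horizontally polarized light and for attenuated left-circular light.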



It is clear from the linear dependence of $S_0$, $S_1$, $S_2$, and $S_3$ on intensity (see
Eqs. (6.54)-(6.57)) that the overall Stokes vector may be regarded as the sum of
the individual Stokes vectors for polarized and unpolarized light. That is, we may
write $S_j = S_j^{(\mathrm{pol})} + S_j^{(\mathrm{un})}$, $j = 0, 1, 2, 3$. This is certainly true for

\[ S_0 = \frac{I_{\mathrm{pol}} + I_{\mathrm{un}}}{I_{\mathrm{in}}} \tag{6.58} \]

and in the other cases the unpolarized portion of the light does not contribute to
the Stokes parameters. Half of the unpolarized light survives any of the test filters,
which cancels neatly with the unpolarized portion of $S_0$ in Eqs. (6.55)-(6.57).

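With the aid of the results in Example 6.4, a completely general form of the
Stokes vector may then be written as

\[
\begin{bmatrix} S_0 \\ S_1 \\ S_2 \\ S_3 \end{bmatrix}
=
\frac{1}{I_{\mathrm{in}}}
\begin{bmatrix}
I_{\mathrm{pol}} + I_{\mathrm{un}} \\
|A|^2 - |B|^2 \\
A^*B + AB^* \\
i\,(A^*B - AB^*)
\end{bmatrix}
\tag{6.59}
\]

where the Jones vector for the polarized portion of the light is

\[ \begin{bmatrix} A \\ B \end{bmatrix} \]

and the intensity of the polarized portion of the light is

\[ I_{\mathrm{pol}} = |A|^2 + |B|^2 \tag{6.60} \]

Again, we have hidden the factor $\frac{1}{2}\epsilon_0 c |E_{\mathrm{eff}}|^2$ for the polarized portion of the light
inside $|A|^2$ and $|B|^2$.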

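We would like to express the degree of polarization in terms of the Stokes
parameters. We first note that the quantity $\sqrt{S_1^2 + S_2^2 + S_3^2}$ can be expressed as

\[
\sqrt{S_1^2 + S_2^2 + S_3^2}
= \sqrt{\left(\frac{|A|^2 - |B|^2}{I_{\mathrm{in}}}\right)^2
 + \left(\frac{A^*B + AB^*}{I_{\mathrm{in}}}\right)^2
 + \left(\frac{i\,(A^*B - AB^*)}{I_{\mathrm{in}}}\right)^2}
= \frac{|A|^2 + |B|^2}{I_{\mathrm{in}}}
= \frac{I_{\mathrm{pol}}}{I_{\mathrm{in}}}
\tag{6.61}
\]

Substituting (6.58) and (6.61) into the expression for the degree of polarization
(6.53) yields

\[ \Pi_{\mathrm{pol}} = \frac{1}{S_0}\sqrt{S_1^2 + S_2^2 + S_3^2} \tag{6.62} \]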

If the light is polarized such that it perfectly transmits through or is perfectly
extinguished by one of the three test polarizers associated with $S_1$, $S_2$, or $S_3$, then
the degree of polarization will be unity. Obviously, it is possible to have pure
polarization states that are not aligned with the axes of any one of these test
polarizers. In this situation, the degree of polarization is still one, although the
values $S_1$, $S_2$, and $S_3$ may all three contribute to (6.62).
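
A short numerical check of (6.62) may help here. In this sketch the polarized part is an arbitrary elliptical state, the unpolarized intensity is added as in (6.59), and the benchmark intensity is assumed to equal the total intensity; the function names are hypothetical.

```python
import math

def degree_of_polarization(stokes):
    """Eq. (6.62): sqrt(S1^2 + S2^2 + S3^2) / S0."""
    s0, s1, s2, s3 = stokes
    return math.sqrt(s1 ** 2 + s2 ** 2 + s3 ** 2) / s0

def mixed_stokes(a, b, i_un):
    """Eq. (6.59) for a polarized part (a, b) plus unpolarized intensity i_un.

    The benchmark intensity I_in is taken to be the total intensity here,
    which is an assumption made only for this example.
    """
    a, b = complex(a), complex(b)
    i_pol = abs(a) ** 2 + abs(b) ** 2
    i_in = i_pol + i_un
    cross = a.conjugate() * b          # A* B
    return ((i_pol + i_un) / i_in,
            (abs(a) ** 2 - abs(b) ** 2) / i_in,
            2 * cross.real / i_in,     # (A*B + AB*) / I_in
            -2 * cross.imag / i_in)    # i (A*B - AB*) / I_in

# An arbitrary elliptical state with I_pol = 0.6, mixed with I_un = 0.4.
a = math.sqrt(0.45)
b = math.sqrt(0.15) * complex(math.cos(1.1), math.sin(1.1))

print(degree_of_polarization(mixed_stokes(a, b, 0.4)))  # close to 0.6
print(degree_of_polarization(mixed_stokes(a, b, 0.0)))  # close to 1 for a pure state
```

The pure state gives unity even though all three of $S_1$, $S_2$, and $S_3$ are nonzero, which is the situation described in the paragraph above.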

Finally, it is possible to represent polarizing devices as matrices that operate 
on the Stokes vectors in much the same way that Jones matrices operate on 
Jones vectors. Since Stokes vectors are four-dimensional, the matrices used are 
four-by-four. These are known as Mueller matrices.⁸

Derivation: Mueller Matrix for a Linear Polarizer 

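We know that 50% of the unpolarized light transmits through a polarizer,
ending up as polarized light with Jones vector

\[
\begin{bmatrix} A'_{\mathrm{un}} \\ B'_{\mathrm{un}} \end{bmatrix}
=
\sqrt{\frac{I_{\mathrm{un}}}{2}}
\begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}
\]

(see table 6.1). As usual, let $\theta$ give the angle of the transmission axis relative to the
horizontal. The Jones matrix (6.32) acts on the polarized portion of the light as
follows

\[
\begin{bmatrix} A'_{\mathrm{pol}} \\ B'_{\mathrm{pol}} \end{bmatrix}
=
\begin{bmatrix}
\cos^2\theta & \cos\theta\sin\theta \\
\cos\theta\sin\theta & \sin^2\theta
\end{bmatrix}
\begin{bmatrix} A \\ B \end{bmatrix}
=
\left[A\cos\theta + B\sin\theta\right]
\begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}
\]

One might be tempted to add $\begin{bmatrix} A'_{\mathrm{un}} \\ B'_{\mathrm{un}} \end{bmatrix}$ and $\begin{bmatrix} A'_{\mathrm{pol}} \\ B'_{\mathrm{pol}} \end{bmatrix}$, but this would be wrong, since
the two beams are not coherent. As mentioned previously, unpolarized light
necessarily contains multiple frequencies, and so the fields from the polarized and
unpolarized beams destructively interfere as often as they constructively interfere.
In this case, we simply add intensities rather than fields. That is, we have

\[
\begin{aligned}
|A'|^2 &= |A'_{\mathrm{un}}|^2 + |A'_{\mathrm{pol}}|^2
 = \frac{I_{\mathrm{un}}}{2}\cos^2\theta + \left|A\cos\theta + B\sin\theta\right|^2\cos^2\theta \\
&= \left[\frac{I_{\mathrm{un}}}{2} + |A|^2\cos^2\theta + |B|^2\sin^2\theta + (A^*B + AB^*)\sin\theta\cos\theta\right]\cos^2\theta \\
&= I_{\mathrm{in}}\left[\frac{S_0}{2} + \frac{\cos 2\theta}{2}S_1 + \frac{\sin 2\theta}{2}S_2\right]\cos^2\theta
\end{aligned}
\]

Similarly,

\[
|B'|^2 = |B'_{\mathrm{un}}|^2 + |B'_{\mathrm{pol}}|^2
 = I_{\mathrm{in}}\left[\frac{S_0}{2} + \frac{\cos 2\theta}{2}S_1 + \frac{\sin 2\theta}{2}S_2\right]\sin^2\theta
\]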

Hans Mueller (Swiss) was a shepherd 
until his late teens. As a physics pro- 
fessor at MIT, he built on the work of 
Stokes and in 1943 formulated a ma- 
trix method for manipulating Stokes 
vectors. He was an engaging lecturer 
into the 1950s and was known for his 
exciting demonstrations. He was a stu- 
dent of Arnold Sommerfeld, and did 
seminal work on ferroelectricity (he is 
reported to have coined the term). See 
Laszlo Tisza, "Adventures of a Theoretical
Physicist, Part II: America," Phys.
Perspect. 11, 120-168 (2009). 



⁸ E. Hecht, Optics, 3rd ed., Sect. 8.12.3 (Massachusetts: Addison-Wesley, 1998).






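Since the light has gone through a linear polarizer, we are guaranteed that $A'$ and
$B'$ have the same phase. Therefore, $A'^*B' = A'B'^* = |A'||B'|$. In view of (6.59), these
results lead to

\[
\begin{aligned}
S_0' &= \frac{|A'|^2 + |B'|^2}{I_{\mathrm{in}}}
 = \frac{S_0}{2} + \frac{\cos 2\theta}{2}S_1 + \frac{\sin 2\theta}{2}S_2 \\
S_1' &= \frac{|A'|^2 - |B'|^2}{I_{\mathrm{in}}}
 = \left[\frac{S_0}{2} + \frac{\cos 2\theta}{2}S_1 + \frac{\sin 2\theta}{2}S_2\right]\left(\cos^2\theta - \sin^2\theta\right) \\
 &= \frac{\cos 2\theta}{2}S_0 + \frac{\cos^2 2\theta}{2}S_1 + \frac{\sin 4\theta}{4}S_2 \\
S_2' &= \frac{|A'||B'| + |A'||B'|}{I_{\mathrm{in}}}
 = 2\left[\frac{S_0}{2} + \frac{\cos 2\theta}{2}S_1 + \frac{\sin 2\theta}{2}S_2\right]\sin\theta\cos\theta \\
 &= \frac{\sin 2\theta}{2}S_0 + \frac{\sin 4\theta}{4}S_1 + \frac{\sin^2 2\theta}{2}S_2 \\
S_3' &= i\,\frac{|A'||B'| - |A'||B'|}{I_{\mathrm{in}}} = 0
\end{aligned}
\]

These transformations expressed in matrix format become

\[
\begin{bmatrix} S_0' \\ S_1' \\ S_2' \\ S_3' \end{bmatrix}
=
\frac{1}{2}
\begin{bmatrix}
1 & \cos 2\theta & \sin 2\theta & 0 \\
\cos 2\theta & \cos^2 2\theta & \frac{1}{2}\sin 4\theta & 0 \\
\sin 2\theta & \frac{1}{2}\sin 4\theta & \sin^2 2\theta & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} S_0 \\ S_1 \\ S_2 \\ S_3 \end{bmatrix}
\]

which reveals the Mueller matrix for a linear polarizer.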
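
The Mueller matrix just derived can be exercised numerically. The sketch below (function names are our own) checks two properties stated in this appendix: an ideal polarizer passes 50% of unpolarized light, and the transmitted light is fully polarized. It also confirms that horizontally polarized light is transmitted with the familiar $\cos^2\theta$ dependence.

```python
import math

def polarizer_mueller(theta):
    """Mueller matrix for an ideal linear polarizer, axis at theta (radians)."""
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    s4 = math.sin(4 * theta)
    return [[0.5,      0.5 * c2,      0.5 * s2,      0.0],
            [0.5 * c2, 0.5 * c2 * c2, 0.25 * s4,     0.0],
            [0.5 * s2, 0.25 * s4,     0.5 * s2 * s2, 0.0],
            [0.0,      0.0,           0.0,           0.0]]

def apply(matrix, stokes):
    """Multiply a 4x4 Mueller matrix onto a Stokes vector."""
    return [sum(row[j] * stokes[j] for j in range(4)) for row in matrix]

theta = math.radians(35.0)  # arbitrary transmission-axis angle
m = polarizer_mueller(theta)

# Unpolarized input: half the intensity survives, and it emerges fully polarized.
unpolarized = [1.0, 0.0, 0.0, 0.0]
out = apply(m, unpolarized)
print(out[0])
print(math.sqrt(out[1] ** 2 + out[2] ** 2 + out[3] ** 2) / out[0])

# Horizontally polarized input: transmitted fraction equals cos^2(theta).
horizontal = [1.0, 1.0, 0.0, 0.0]
print(apply(m, horizontal)[0], math.cos(theta) ** 2)
```

Applying `polarizer_mueller` twice with the same angle should change nothing further, which is another quick sanity check on the matrix.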



The Mueller matrix for a half-wave plate is worked out below. The Mueller
matrix for a quarter-wave plate is deferred to problem P6.15.

Derivation: Mueller Matrix for a Half Wave Plate 

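We know that all of the light transmits through the wave plate. This immediately
gives

\[ S_0' = S_0 \]

The wave plate does nothing to unpolarized light. On the other hand, the polarized
portion of the light is influenced by the wave plate as follows (see (6.39)):

\[
\begin{bmatrix} A' \\ B' \end{bmatrix}
=
\begin{bmatrix}
\cos 2\theta & \sin 2\theta \\
\sin 2\theta & -\cos 2\theta
\end{bmatrix}
\begin{bmatrix} A \\ B \end{bmatrix}
=
\begin{bmatrix}
A\cos 2\theta + B\sin 2\theta \\
A\sin 2\theta - B\cos 2\theta
\end{bmatrix}
\]

As usual, $\theta$ is the angle of the fast axis relative to the horizontal. (As expected,
$|A'|^2 + |B'|^2 = |A|^2 + |B|^2$; the intensity of the light is unaltered.) Using (6.59) we get

\[
\begin{aligned}
S_1' &= \frac{|A'|^2 - |B'|^2}{I_{\mathrm{in}}}
 = \frac{|A\cos 2\theta + B\sin 2\theta|^2 - |A\sin 2\theta - B\cos 2\theta|^2}{I_{\mathrm{in}}} \\
&= \frac{\left(|A|^2 - |B|^2\right)\cos 4\theta + (A^*B + AB^*)\sin 4\theta}{I_{\mathrm{in}}}
 = S_1\cos 4\theta + S_2\sin 4\theta
\end{aligned}
\]

\[
\begin{aligned}
S_2' &= \frac{A'^*B' + A'B'^*}{I_{\mathrm{in}}} \\
&= \frac{(A^*\cos 2\theta + B^*\sin 2\theta)(A\sin 2\theta - B\cos 2\theta)}{I_{\mathrm{in}}}
 + \frac{(A\cos 2\theta + B\sin 2\theta)(A^*\sin 2\theta - B^*\cos 2\theta)}{I_{\mathrm{in}}} \\
&= \frac{|A|^2 - |B|^2}{I_{\mathrm{in}}}\sin 4\theta - \frac{AB^* + A^*B}{I_{\mathrm{in}}}\cos 4\theta
 = S_1\sin 4\theta - S_2\cos 4\theta
\end{aligned}
\]

\[
\begin{aligned}
S_3' &= i\,\frac{A'^*B' - A'B'^*}{I_{\mathrm{in}}} \\
&= i\,\frac{(A^*\cos 2\theta + B^*\sin 2\theta)(A\sin 2\theta - B\cos 2\theta)}{I_{\mathrm{in}}}
 - i\,\frac{(A\cos 2\theta + B\sin 2\theta)(A^*\sin 2\theta - B^*\cos 2\theta)}{I_{\mathrm{in}}} \\
&= -i\,\frac{A^*B - AB^*}{I_{\mathrm{in}}} = -S_3
\end{aligned}
\]

These transformations expressed in matrix format become

\[
\begin{bmatrix} S_0' \\ S_1' \\ S_2' \\ S_3' \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & \cos 4\theta & \sin 4\theta & 0 \\
0 & \sin 4\theta & -\cos 4\theta & 0 \\
0 & 0 & 0 & -1
\end{bmatrix}
\begin{bmatrix} S_0 \\ S_1 \\ S_2 \\ S_3 \end{bmatrix}
\]

which reveals the Mueller matrix for a half wave plate.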
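
One way to gain confidence in this matrix is to compare two routes for a fully polarized input: propagate the Jones vector with (6.39) and then form the Stokes parameters via (6.59), or form the Stokes parameters first and apply the Mueller matrix. The sketch below does this for an arbitrary input; all names and numerical values are illustrative assumptions.

```python
import math

def half_wave_jones(theta):
    """Jones matrix (6.39), fast axis at theta (radians)."""
    return [[math.cos(2 * theta), math.sin(2 * theta)],
            [math.sin(2 * theta), -math.cos(2 * theta)]]

def half_wave_mueller(theta):
    """Mueller matrix for a half wave plate, as derived above."""
    c4, s4 = math.cos(4 * theta), math.sin(4 * theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c4, s4, 0.0],
            [0.0, s4, -c4, 0.0],
            [0.0, 0.0, 0.0, -1.0]]

def stokes(a, b, i_in):
    """Stokes vector (6.59) for fully polarized light with Jones vector (a, b)."""
    cross = a.conjugate() * b          # A* B
    return [(abs(a) ** 2 + abs(b) ** 2) / i_in,
            (abs(a) ** 2 - abs(b) ** 2) / i_in,
            2 * cross.real / i_in,     # (A*B + AB*) / I_in
            -2 * cross.imag / i_in]    # i (A*B - AB*) / I_in

# Arbitrary elliptical input state and plate angle.
a, b = complex(0.8, 0.1), complex(-0.2, 0.5)
i_in = abs(a) ** 2 + abs(b) ** 2
theta = math.radians(17.0)

# Route 1: Jones matrix first, then Stokes parameters.
j = half_wave_jones(theta)
a_out = j[0][0] * a + j[0][1] * b
b_out = j[1][0] * a + j[1][1] * b
via_jones = stokes(a_out, b_out, i_in)

# Route 2: Stokes parameters first, then Mueller matrix.
s_in = stokes(a, b, i_in)
via_mueller = [sum(row[k] * s_in[k] for k in range(4))
               for row in half_wave_mueller(theta)]

# The two routes should agree up to floating-point rounding.
print(max(abs(x - y) for x, y in zip(via_jones, via_mueller)))
```

The same comparison could be repeated for the quarter-wave plate once its Mueller matrix has been worked out in P6.15.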
Exercises

Exercises for 6.2 Jones Vectors for Representing Polarization

P6.1 Show that $(A\hat{\mathbf{x}} + Be^{i\delta}\hat{\mathbf{y}}) \cdot (A\hat{\mathbf{x}} + Be^{i\delta}\hat{\mathbf{y}})^* = 1$, as defined in connection
with (6.5).

P6.2 Prove that if $0 < \delta < \pi$, the helicity is left-handed, and if $\pi < \delta < 2\pi$ the
helicity is right-handed.

HINT: Write the relevant real field associated with (6.5)

\[ \mathbf{E}(z, t) = |E_{\mathrm{eff}}| \left[\hat{\mathbf{x}} A\cos(kz - \omega t + \phi) + \hat{\mathbf{y}} B\cos(kz - \omega t + \phi + \delta)\right] \]

where $\phi$ is the phase of $E_{\mathrm{eff}}$. Freeze time at, say, $t = \phi/\omega$. Determine the
field at $z = 0$ and at $z = \lambda/4$ (a quarter cycle), say. If $\mathbf{E}(0, t) \times \mathbf{E}(\lambda/4, t)$
points in the direction of $\mathbf{k}$, then the helicity matches that of a wood
screw.

L6.3 Determine how much right-handed circularly polarized light ($\lambda_{\mathrm{vac}}$ =
633 nm) is delayed (or advanced) with respect to left-handed circularly
polarized light as it goes through approximately 3 cm of Karo syrup
(the neck of the bottle). This phenomenon is called optical activity.
Because of a definite handedness to the molecules in the syrup, right-
and left-handed polarized light experience slightly different refractive
indices. (video)



Figure 6.11 Lab schematic for L 6.3 (figure labels: laser, polarizer, Karo light corn syrup, polarizer, screen)

HINT: Linearly polarized light contains equal amounts of right and left
circularly polarized light. Consider

\[
\frac{1}{2}\begin{bmatrix} 1 \\ i \end{bmatrix}
+ \frac{e^{i\phi}}{2}\begin{bmatrix} 1 \\ -i \end{bmatrix}
\]

where $\phi$ is the phase delay of the right circular polarization. Show that
this can be written as

\[
e^{i\delta}\begin{bmatrix} \cos\phi/2 \\ \sin\phi/2 \end{bmatrix}
\]

The overall phase $\delta$ is unimportant. Compare this with

\[ \begin{bmatrix} \cos\alpha \\ \sin\alpha \end{bmatrix} \]

where $\alpha$ is the angle of linearly polarized light (see table 6.1).

Exercises for 6.3 Elliptically Polarized Light 

P6.4 For the following cases, what is the orientation of the major axis, and
what is the ellipticity of the light? Case I: $A = B = 1/\sqrt{2}$; $\delta = 0$; Case II:
$A = B = 1/\sqrt{2}$; $\delta = \pi/2$; Case III: $A = B = 1/\sqrt{2}$; $\delta = \pi/4$.

Exercises for 6.4 Linear Polarizers and Jones Matrices

P6.5 (a) Suppose that linearly polarized light is oriented at an angle $\alpha$ with
respect to the horizontal axis (x-axis) (see table 6.1). What fraction of
the original intensity gets through a vertically oriented polarizer?

(b) If the original light is right-circularly polarized, what fraction of the
original intensity gets through the same polarizer?

Exercises for 6.5 Jones Matrix for Polarizers at Arbitrary Angles

P6.6 Horizontally polarized light ($\alpha = 0$) is sent through two polarizers, the
first oriented at $\theta_1 = 45^\circ$ and the second at $\theta_2 = 90^\circ$. What fraction of
the original intensity emerges? What is the fraction if the ordering of
the polarizers is reversed?

P6.7 (a) Suppose that linearly polarized light is oriented at an angle $\alpha$ with
respect to the horizontal or x-axis. What fraction of the original intensity
emerges from a polarizer oriented with its transmission at angle $\theta$
from the x-axis?

Answer: $\cos^2(\theta - \alpha)$; compare with P6.5.

(b) If the original light is right-circularly polarized, what fraction of the
original intensity emerges from the same polarizer?

P6.8 Derive (6.12), (6.13), and (6.14).

HINT: Analyze the Jones vector just as you would analyze light in the
laboratory. Put a polarizer in the beam and observe the intensity of the
light as a function of polarizer angle. Compute the intensity via (6.23).
Then find the polarizer angle (call it $\alpha$) that gives a maximum (or a
minimum) of intensity. The angle then corresponds to an axis of the
ellipse inscribing the E-field as it spirals. When taking the arctangent,
remember that it is defined only over a range of $\pi$. You can add $\pi$ for
another valid result (which corresponds to the second ellipse axis).
Exercises for 6.6 Jones Matrices for Wave Plates

L6.9 Create a source of unknown elliptical polarization by reflecting a linearly
polarized laser beam (with both s- and p-components) from a
metal mirror with a large incident angle (i.e. $\theta_i > 80^\circ$). Use a quarter-wave
plate and a polarizer to determine the Jones vector of the reflected
beam. Find the ellipticity, the helicity (right or left handed), and the
orientation of the major axis. (video)

Figure 6.12 Lab schematic for L 6.9 (figure labels: silver mirror at about 80° angle of incidence; screen)

HINT: A polarizer alone can reveal the direction of the major and minor
axes and the ellipticity, but it does not reveal the helicity. Use a quarter-wave
plate (oriented at a special angle $\theta$) to convert the unknown
elliptically polarized light into linearly polarized light. A subsequent
polarizer can then extinguish the light, from which you can determine
the Jones vector of the light coming through the wave plate. This must
equal the original (unknown) Jones vector (6.11) operated on by the
wave plate (6.37). As you solve the matrix equation, it is helpful to note
that the inverse of (6.37) is its own complex conjugate.

P6.10 What is the minimum thickness (called zero-order thickness) of a
quartz plate made to operate as a quarter-wave plate for $\lambda_{\mathrm{vac}} = 500$ nm?
The indices of refraction are $n_{\mathrm{fast}} = 1.54424$ and $n_{\mathrm{slow}} = 1.55335$.

Figure 6.13 Geometry for P6.11



Exercises for 6.7 Polarization Effects of Reflection and Transmission 

P6.11 Light is linearly polarized at $\alpha = 45^\circ$ with a Jones vector according to
table 6.1. The light is reflected from a vertical silver mirror with angle
of incidence $\theta_i = 80^\circ$, as described in (P3.15). Find the Jones vector
representation for the polarization of the reflected light.

NOTE: The answer may be somewhat different than the result measured
in L 6.9. For one thing, we have not considered that a silver mirror
inevitably has a thin oxide layer.

P6.12 Calculate the angle $\theta$ to cut the glass in a Fresnel rhomb such that after
the two internal reflections there is a phase difference of $\pi/2$ between
the two polarization states. The rhomb then acts as a quarter-wave
plate.

HINT: You need to find the phase difference between (3.40) and (3.41).
Set the difference equal to $\pi/4$ for each bounce. The equation you get
does not have a clean analytic solution, but you can plot it to find a
numerical solution.

Answer: There are two angles that work: $\theta \cong 50^\circ$ and $\theta \cong 53^\circ$.



Exercises for 6.A Ellipsometry

P6.13 Derive (6.49) and (6.51), often used for ellipsometry measurements.
HINT: Using $\sin^2\theta = \frac{1 - \cos 2\theta}{2}$ and $\cos^2\theta = \frac{1 + \cos 2\theta}{2}$, first show

Figure 6.14 Fresnel Rhomb geometry
for P6.12



Exercises for 6.B Partially Polarized Light 



P6.14 (a) One way to construct a right-circular polarizer is using a quarter-
wave plate with fast axis at 45°, followed by a linear polarizer oriented
vertically, and finally a quarter-wave plate with fast axis at −45°.
Calculate the Jones matrix for this system.

Answer (up to an overall phase):

$$ \frac{1}{2}\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix} $$



(b) Check that the device leaves right-circularly polarized light unaltered
while killing left-circularly polarized light.

P6.15 Derive the Mueller matrix for a quarter wave plate. 



Answer: 





cos 2 20 
\ sin 40 
Chapter 7



Superposition of Quasi-Parallel 
Plane Waves 

To this point, we have typically only considered individual plane-wave fields 
which have uniform intensity throughout space and time. Some optical fields 
can be well-approximated by a plane wave, but most have a more complicated 
structure. Nevertheless, it turns out that any field (e.g. pulses or focused beams), 
regardless of how complicated, can be described by a superposition of many 
plane wave fields. In this chapter, we develop the techniques for superimposing 
plane waves. 

We begin our analysis with a discrete sum of plane wave fields and show how 
to calculate the intensity in this case. We will introduce the concept of group 
velocity, which describes the motion of interference 'ripples' resulting when 
multiple plane waves are superimposed. Group velocity is distinct from phase 
velocity that we encountered previously. As we saw in chapter 2, the real part of 
refractive index in certain situations can be less than one, indicating superluminal 
wave crest propagation (i.e. greater than c). In this case, the group velocity is
usually less than c. Since group velocity tracks the speed of the interference 
ripples, regions of light intensity tend to advance with the group velocity rather 
than the phase velocity. 

Beginning in section 7.3, we extend our analysis of wave superposition to 
waveforms composed of continua of plane waves rather than from discrete sums. 
The analysis is based on Fourier theory (see section 0.4 for a review), which 
in essence is a tool for keeping track of the plane waves that make up a given 
waveform E(r, t). Since it is easiest to deal with plane waves, we will learn how
to decompose arbitrary waveforms into plane waves for purposes of determining 
effects such as propagation in a material (with a frequency-dependent index). 
Conversely, we will also learn how to reassemble plane waves into a final pulse at 
the end of propagation. 

Different frequency components of the waveform experience different phase
velocities, causing the waveform to undergo distortion as it propagates, a
phenomenon called dispersion. We shall see that the group velocity tracks the
movement of the center of the wave packet. For narrowband packets (i.e. packets
comprised of a narrow range of frequencies and hence long duration), the packet
tends to maintain its shape (with some spreading) while propagating at the group
velocity. On the other hand, broadband pulses (i.e. packets comprised of many
frequencies and possibly of short duration) tend to distort severely while
propagating in materials. Nevertheless, the group velocity tracks the center of
the pulse.

Sir Isaac Newton (1643-1727, English) was born in Lincolnshire, England
three months after the death of his father who was a farmer. Newton spent
much of his childhood with his maternal grandmother, after his mother
remarried. (Newton did not like his stepfather.) In his teenage years, Newton's
mother tried to persuade him to take up farming, but his love for education won
out. He became the top-ranked student and was admitted into Trinity College,
Cambridge at age 18. Newton was influenced by the works of Descartes,
Copernicus, Galileo, and Kepler. Upon graduation four years later, the
university closed for two years because of a plague. Newton's return to farm
life coincided with a remarkable period when he first developed ideas on
calculus, gravitation, and optics. Newton later returned to Cambridge where he
spent his extraordinarily prolific career and became the first scientist to be
knighted. In optics, Newton advanced the ray theory of light and image
formation. He showed that 'white' light is comprised of many colors and that the
amount of refraction depends on color. He built the first reflecting telescope,
which avoids chromatic aberration. Newton advocated against the wave theory of
light in favor of his 'corpuscular' theory. (Imagining that by this Newton
foresaw the quantized nature of light energy gives too much credit!) (Wikipedia)

It turns out that group velocity can become superluminal when significant
absorption and/or amplification of the light pulse is involved. This is no cause
for alarm (nor is it cause for an abundance of gee-whiz papers on the subject).
Absorption and amplification can cause a pulse to appear to move unexpectedly
fast through a reshaping effect. Group velocity, or rather its inverse, group delay,
takes this into account, which makes it remarkably general. In such a scenario,
energy can be lost from the back of a pulse or perhaps added to an already-present 
forward portion of a pulse such that the average pulse position appears to advance 
abruptly. When all energy is accounted for (both the energy in the medium and in 
the light pulse), however, nothing advances faster than the universal speed limit 
c. Appendix 7.B gives a good look under the hood at how a medium exchanges 
energy with a pulse to produce these eye-catching effects. 

7.1 Intensity of Superimposed Plane Waves

We can construct arbitrary waveforms by adding together many plane waves with 
different propagation directions, amplitudes, phases, frequencies and polariza- 
tions. Consider the following discrete sum of plane waves: 



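$$ \mathbf{E}(\mathbf{r},t) = \sum_j \mathbf{E}_j\, e^{i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)} \tag{7.1} $$

The corresponding magnetic field according to (2.56) is

$$ \mathbf{B}(\mathbf{r},t) = \sum_j \mathbf{B}_j\, e^{i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)} = \sum_j \frac{\mathbf{k}_j\times\mathbf{E}_j}{\omega_j}\, e^{i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)} \tag{7.2} $$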



As usual, the (time- and space-independent) individual field components E_j
contain both amplitude and phase information for each plane wave. 
The Poynting vector (2.52) associated with the fields (7.1) and (7.2) is 



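$$ \mathbf{S}(\mathbf{r},t) = \mathrm{Re}\{\mathbf{E}(\mathbf{r},t)\} \times \frac{1}{\mu_0}\mathrm{Re}\{\mathbf{B}(\mathbf{r},t)\} = \sum_{j,m} \mathrm{Re}\left\{\mathbf{E}_j\, e^{i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)}\right\} \times \frac{1}{\mu_0}\mathrm{Re}\left\{\frac{\mathbf{k}_m\times\mathbf{E}_m}{\omega_m}\, e^{i(\mathbf{k}_m\cdot\mathbf{r}-\omega_m t)}\right\} \tag{7.3} $$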



where we have assumed that the k_m vectors are real. (Recall the conspiracy that
only the real parts of the fields are relevant - crucial when multiplying.) The above
expression is cumbersome because of the many cross terms that arise when
the two summations are multiplied. We need some simplifying assumptions
before we can make any real progress on this expression. For example, we can
time-average the rapid fluctuations in the expression that vary on the scale of
optical frequencies. Additionally, it is common to encounter the situation where
all plane-wave components travel roughly parallel to each other, which will be a
big help in simplifying (7.3).

Intensity for Quasi Parallel-Traveling Light 

For simplicity, we assume that all vectors k_j are real. If the wave vectors are
complex, the result is essentially the same, but, as in (2.62), the field amplitudes E_j
correspond to local amplitudes (adjusted for absorption or amplification during
prior propagation). We apply the BAC-CAB rule (P0.3) to (7.3) and obtain

$$ \begin{aligned} \mathbf{S}(\mathbf{r},t) = \sum_{j,m} \frac{1}{\omega_m\mu_0} \Big[ &\mathbf{k}_m \Big( \mathrm{Re}\big\{\mathbf{E}_j e^{i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)}\big\} \cdot \mathrm{Re}\big\{\mathbf{E}_m e^{i(\mathbf{k}_m\cdot\mathbf{r}-\omega_m t)}\big\} \Big) \\ &- \mathrm{Re}\big\{\mathbf{E}_m e^{i(\mathbf{k}_m\cdot\mathbf{r}-\omega_m t)}\big\} \Big( \mathrm{Re}\big\{\mathbf{E}_j e^{i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)}\big\} \cdot \mathbf{k}_m \Big) \Big] \end{aligned} \tag{7.4} $$

The last term in (7.4) can be dismissed if all k-vectors are approximately parallel to
each other, in which case all of the k_m are essentially perpendicular to each of the
E_j. We will make this rather stringent assumption and kill the last line in (7.4). The
magnitude of the Poynting vector then becomes (with the help of (0.30))



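$$ \begin{aligned} S(\mathbf{r},t) &= \sum_{j,m} \frac{k_m}{\omega_m\mu_0} \left[\frac{\mathbf{E}_j e^{i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)} + \mathbf{E}_j^* e^{-i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)}}{2}\right] \cdot \left[\frac{\mathbf{E}_m e^{i(\mathbf{k}_m\cdot\mathbf{r}-\omega_m t)} + \mathbf{E}_m^* e^{-i(\mathbf{k}_m\cdot\mathbf{r}-\omega_m t)}}{2}\right] \\ &= \sum_{j,m} \frac{k_m}{4\omega_m\mu_0} \Big[ \mathbf{E}_j\cdot\mathbf{E}_m\, e^{i[(\mathbf{k}_j+\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j+\omega_m)t]} + \mathbf{E}_j^*\cdot\mathbf{E}_m^*\, e^{-i[(\mathbf{k}_j+\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j+\omega_m)t]} \\ &\qquad\qquad + \mathbf{E}_j\cdot\mathbf{E}_m^*\, e^{i[(\mathbf{k}_j-\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j-\omega_m)t]} + \mathbf{E}_j^*\cdot\mathbf{E}_m\, e^{-i[(\mathbf{k}_j-\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j-\omega_m)t]} \Big] \end{aligned} \tag{7.5} $$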

The terms involving (ω_j + ω_m)t oscillate rapidly and time-average to zero. By
comparison, the terms involving (ω_j − ω_m)t oscillate slowly (especially when the
ω_j are all in the neighborhood of the ω_m) or not at all when j = m. We retain the
slower fluctuations and discard the rapid oscillations. For purposes of computing
the intensity (as opposed to determining phase changes with propagation) we can
approximate the index as a constant, and write k_m/(ω_m μ_0) ≅ n ε_0 c. (We seldom
measure intensity inside of materials anyway.) With these simplifications, (7.5)
becomes

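$$ \begin{aligned} S(\mathbf{r},t) &= \frac{n\epsilon_0 c}{2} \sum_{j,m} \frac{\mathbf{E}_j\cdot\mathbf{E}_m^*\, e^{i[(\mathbf{k}_j-\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j-\omega_m)t]} + \mathbf{E}_j^*\cdot\mathbf{E}_m\, e^{-i[(\mathbf{k}_j-\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j-\omega_m)t]}}{2} \\ &= \frac{n\epsilon_0 c}{2}\, \mathrm{Re}\left\{ \sum_{j,m} \mathbf{E}_j\cdot\mathbf{E}_m^*\, e^{i[(\mathbf{k}_j-\mathbf{k}_m)\cdot\mathbf{r}-(\omega_j-\omega_m)t]} \right\} \\ &= \frac{n\epsilon_0 c}{2}\, \mathrm{Re}\left\{ \sum_j \mathbf{E}_j e^{i(\mathbf{k}_j\cdot\mathbf{r}-\omega_j t)} \cdot \sum_m \mathbf{E}_m^* e^{-i(\mathbf{k}_m\cdot\mathbf{r}-\omega_m t)} \right\} \\ &= \frac{n\epsilon_0 c}{2}\, \mathrm{Re}\left\{ \mathbf{E}(\mathbf{r},t)\cdot\mathbf{E}^*(\mathbf{r},t) \right\} \qquad \text{(parallel k-vectors)} \end{aligned} \tag{7.6} $$

Figure 7.1 Animation showing superposition of two plane waves
(electric fields) with different frequencies and traveling at different
speeds.

The final expression in (7.6) is already manifestly real so there is no need
to apply the operation Re {}. The time-averaged intensity for light composed of
parallel wave vectors is then well-approximated by

$$ I(\mathbf{r},t) = \frac{n\epsilon_0 c}{2}\, \mathbf{E}(\mathbf{r},t)\cdot\mathbf{E}^*(\mathbf{r},t) \qquad \text{(valid for parallel or antiparallel k-vectors and constant } n\text{)} \tag{7.7} $$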



In a surprising turn of events, it is important that E(r, t) in (7.7) be written as the
entire complex expression for the electric field rather than just the real part. Then
(7.7) automatically time-averages over rapid oscillations in such a way that I(r, t)
retains a slowly varying time dependence. This expression is reminiscent of (2.62),
but it should be kept in mind that we previously considered only a single plane
wave (perhaps with two distinct polarization components).

If some of the k-vectors point in an anti-parallel direction, we can still use 
(7.7) with negative signs entered explicitly for those components. This brings up 
a distinction between irradiance S and intensity I. For example, ⟨S⟩ is zero for
standing waves because there is no net flow of energy, whereas (7.7) still gives a 
result. Intensity specifies whether atoms locally experience an oscillating electric 
field without regard for whether there is a net flow of energy carried by a light field. 
At extreme intensities, however, when the influence of the magnetic field becomes 
comparable to that of the electric field, the distinction between propagating and 
standing fields becomes important to the behavior of charged particles in that 
field. 

The assumption that all vectors k_j are parallel is not as serious as it might
seem at first. For example, the output of a Michelson interferometer (studied
in chapter 8) is the superposition of two fields, each composed of a range of
frequencies with parallel k_j's. We can relax the restriction of parallel k_j's slightly
and apply (7.7) also to plane waves with nearly parallel k_j's such as occurs in a
Young's two-slit diffraction experiment (studied in chapter 8). In such diffraction
problems, (7.7) is viewed as an approximation valid to the extent that the vectors
k_j are close to parallel. For the remainder of the chapter we will assume that the
k-vectors for all frequency components in our waveform are essentially parallel.
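
As a numerical illustration of (7.7), the following sketch compares the
complex-field formula with a brute-force average of the instantaneous Poynting
magnitude over one carrier period. It is only a sketch under stated
assumptions: two co-propagating scalar waves evaluated at r = 0, units chosen
so that n ε_0 c = 1, and illustrative frequencies (100 and 101 in arbitrary
units). The function names are hypothetical and not part of the text.

```python
import cmath
import math


def field(t, omegas, amps):
    """Complex scalar field at r = 0: sum_j E_j exp(-i w_j t), as in (7.1)."""
    return sum(a * cmath.exp(-1j * w * t) for a, w in zip(amps, omegas))


def intensity(t, omegas, amps):
    """Eq. (7.7) in units where n*eps0*c = 1: I = (1/2) E . E*."""
    e = field(t, omegas, amps)
    return 0.5 * (e * e.conjugate()).real


def cycle_averaged_flux(t, omegas, amps, steps=2000):
    """Average of (Re E)^2 over one mean carrier period centered on t.

    For parallel k-vectors in these units the instantaneous Poynting
    magnitude is (Re E)^2, so this is the brute-force time average.
    """
    period = 2.0 * math.pi * len(omegas) / sum(omegas)
    total = 0.0
    for s in range(steps):
        ts = t - period / 2.0 + period * (s + 0.5) / steps
        total += field(ts, omegas, amps).real ** 2
    return total / steps


if __name__ == "__main__":
    omegas = [100.0, 101.0]   # two nearby carrier frequencies (assumed values)
    amps = [1.0, 1.0]         # equal real amplitudes (assumed values)
    for t in (0.0, 1.0, 2.0, 3.0):
        print(t, intensity(t, omegas, amps), cycle_averaged_flux(t, omegas, amps))
```

The two columns should agree closely as long as the beat period is much longer
than the carrier period, which is the regime in which the time average behind
(7.7) is meaningful.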

7.2 Group vs. Phase Velocity: Sum of Two Plane Waves 

To begin our study of interference, consider just two plane waves with equal 
amplitudes given by 



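$$ \mathbf{E}_1 = \mathbf{E}_0\, e^{i(\mathbf{k}_1\cdot\mathbf{r}-\omega_1 t)} \quad \text{and} \quad \mathbf{E}_2 = \mathbf{E}_0\, e^{i(\mathbf{k}_2\cdot\mathbf{r}-\omega_2 t)} \tag{7.8} $$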



As we previously studied (see P1.9), the velocities of the wave crests for these two
waves are

$$ v_{p1} = \omega_1/k_1 \quad \text{and} \quad v_{p2} = \omega_2/k_2 \tag{7.9} $$

These are known as the phase velocities of the individual plane waves.

Next consider a composite wave created from the superposition of the above
two plane waves:

$$ \mathbf{E}(\mathbf{r},t) = \mathbf{E}_0\, e^{i(\mathbf{k}_1\cdot\mathbf{r}-\omega_1 t)} + \mathbf{E}_0\, e^{i(\mathbf{k}_2\cdot\mathbf{r}-\omega_2 t)} \tag{7.10} $$

The two plane waves interfere, producing regions of higher and lower intensity 
that move in time. Remarkably, these intensity peaks can propagate at speeds 
quite different from either of the phase velocities in (7.9). The intensity (7.7) for 
the field (7.10) is computed as follows: 



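$$ \begin{aligned} I(\mathbf{r},t) &= \frac{n\epsilon_0 c}{2} \left[\mathbf{E}_0 e^{i(\mathbf{k}_1\cdot\mathbf{r}-\omega_1 t)} + \mathbf{E}_0 e^{i(\mathbf{k}_2\cdot\mathbf{r}-\omega_2 t)}\right] \cdot \left[\mathbf{E}_0^* e^{-i(\mathbf{k}_1\cdot\mathbf{r}-\omega_1 t)} + \mathbf{E}_0^* e^{-i(\mathbf{k}_2\cdot\mathbf{r}-\omega_2 t)}\right] \\ &= \frac{n\epsilon_0 c}{2}\, \mathbf{E}_0\cdot\mathbf{E}_0^* \left[ 2 + e^{i[(\mathbf{k}_2-\mathbf{k}_1)\cdot\mathbf{r}-(\omega_2-\omega_1)t]} + e^{-i[(\mathbf{k}_2-\mathbf{k}_1)\cdot\mathbf{r}-(\omega_2-\omega_1)t]} \right] \\ &= n\epsilon_0 c\, \mathbf{E}_0\cdot\mathbf{E}_0^* \left[ 1 + \cos\left[(\mathbf{k}_2-\mathbf{k}_1)\cdot\mathbf{r} - (\omega_2-\omega_1)t\right] \right] \\ &= n\epsilon_0 c\, \mathbf{E}_0\cdot\mathbf{E}_0^* \left[ 1 + \cos(\Delta\mathbf{k}\cdot\mathbf{r} - \Delta\omega\, t) \right] \end{aligned} \tag{7.11} $$

where

$$ \Delta\mathbf{k} \equiv \mathbf{k}_2 - \mathbf{k}_1, \qquad \Delta\omega \equiv \omega_2 - \omega_1 \tag{7.12} $$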



The darker line in Fig. 7.2 shows the intensity computed with (7.11). Keep in
mind that this intensity is averaged over rapid oscillations. For comparison, the
lighter line shows the Poynting flux with the rapid oscillations retained, according
to (7.5). It is left as an exercise (see P7.3) to show that the rapid-oscillation peaks
in Fig. 7.2 move with a phase velocity derived from the average k and average ω
of the two plane waves:

$$ v_p = \frac{\mathrm{ave}\{\omega\}}{\mathrm{ave}\{k\}} \tag{7.13} $$

An examination of the cosine argument in (7.11) reveals that the time-averaged
curve in Fig. 7.2 (solid) travels with speed

$$ v_g \equiv \frac{\Delta\omega}{\Delta k} \tag{7.14} $$



This is known as the group velocity. Essentially, v_g may be thought of as the
velocity for the envelope that encloses the rapid oscillations.

In general, v_g and v_p are not the same. This means that as the waveform
propagates, the rapid oscillations move within the larger modulation pattern, for
example, continually disappearing at the front and reappearing at the back of
each modulation. The group velocity is identified with the propagation of overall
waveforms. The presence of field energy in a waveform is clearly tied more to v_g
than to v_p.
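
The distinction between (7.13) and (7.14) can be checked numerically. The
sketch below evaluates the beat pattern (7.11) along z, locates the envelope
peak at two times, and compares the measured peak speed with Δω/Δk. The wave
numbers and frequencies are arbitrary illustrative values (not taken from the
text), the prefactor n ε_0 c E_0·E_0* is set to one, and the function names are
hypothetical.

```python
import math


def beat_intensity(z, t, k1, k2, w1, w2):
    """Eq. (7.11) along z, normalized so that n*eps0*c*E0.E0* = 1."""
    return 1.0 + math.cos((k2 - k1) * z - (w2 - w1) * t)


def envelope_peak(t, k1, k2, w1, w2, z_max=5.0, samples=50001):
    """Grid position of the largest intensity in 0 <= z <= z_max."""
    best_z, best_i = 0.0, -1.0
    for s in range(samples):
        z = z_max * s / (samples - 1)
        i = beat_intensity(z, t, k1, k2, w1, w2)
        if i > best_i:
            best_z, best_i = z, i
    return best_z


def velocities(k1, k2, w1, w2):
    """Phase velocity (7.13) and group velocity (7.14) of the pair."""
    v_p = (w1 + w2) / (k1 + k2)
    v_g = (w2 - w1) / (k2 - k1)
    return v_p, v_g


if __name__ == "__main__":
    k1, k2, w1, w2 = 10.0, 11.0, 10.0, 10.5   # assumed, arbitrary units
    v_p, v_g = velocities(k1, k2, w1, w2)
    dt = 2.0
    measured = (envelope_peak(dt, k1, k2, w1, w2)
                - envelope_peak(0.0, k1, k2, w1, w2)) / dt
    print("v_p =", v_p, " v_g =", v_g, " measured envelope speed =", measured)
```

The search window 0 ≤ z ≤ 5 is deliberately shorter than one beat length
2π/Δk so that only a single envelope peak is tracked.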



Example 7.1 

Determine the phase velocity and group velocity for the superposition of two plane 
waves in a plasma (see P2.7). 




Figure 7.2 Intensity of two interfering plane waves. The solid line
shows intensity averaged over rapid oscillations.

John William Strutt (3rd Baron Rayleigh) (1842-1919, British) was
born in Langford Grove, Essex, England and was frequently ill in his youth. He
entered the University of Cambridge in 1861 and graduated four years later as
senior wrangler in mathematics. He married in 1871 and became the father of
three sons. In 1873, Strutt inherited the Barony of Rayleigh (and the title Lord
Rayleigh) from his father who died that year. In 1879 Strutt succeeded James
Clerk Maxwell as the Cavendish Professor of Physics at Cambridge. Rayleigh
studied a wide variety of subjects. He is credited with the discovery of argon.
He studied how atoms scatter light (Rayleigh scattering) and explained why
the sky is blue. He developed the notion of group velocity and used it to
understand the propagation of sound. He won the Nobel prize in physics in 1904.
(Wikipedia)



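Solution: The index of refraction is given by

$$ n_{\rm plasma}(\omega) = \sqrt{1-\omega_p^2/\omega^2} < 1 \qquad (\text{assuming } \omega > \omega_p) \tag{7.15} $$

The phase velocity for each frequency is computed as

$$ v_p = \frac{\omega_1+\omega_2}{k_1+k_2} = \frac{\omega_1+\omega_2}{n_{\rm plasma}(\omega_1)\,\omega_1/c + n_{\rm plasma}(\omega_2)\,\omega_2/c} \cong \frac{c}{n_{\rm plasma}} \tag{7.16} $$

For convenience, we have taken ω_1 and ω_2 to lie very close to each other. Since
n_plasma < 1, both of these velocities exceed c. However, the group velocity is

$$ v_g = \frac{\Delta\omega}{\Delta k} \cong \frac{d\omega}{dk} = \left[\frac{dk}{d\omega}\right]^{-1} = \left[\frac{d}{d\omega}\,\frac{\omega\, n_{\rm plasma}(\omega)}{c}\right]^{-1} = n_{\rm plasma}(\omega)\, c \tag{7.17} $$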



which is clearly less than c. The derivation of the final expression in (7.17) from 
the previous one is left as an exercise. 
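
The result of Example 7.1 can be checked numerically without doing the
derivative by hand. The sketch below evaluates (7.13) and (7.14) for two closely
spaced frequencies using the plasma index (7.15). The plasma frequency and the
two light frequencies are arbitrary assumed values, and the function names are
hypothetical. For this particular dispersion relation, k²c² = ω² − ω_p², so the
product v_p v_g formed from (7.13) and (7.14) should come out equal to c² to
within rounding.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def n_plasma(w, w_p):
    """Plasma index of refraction (7.15); valid for w > w_p."""
    return math.sqrt(1.0 - (w_p / w) ** 2)


def k_plasma(w, w_p):
    """Wave number k = n(w) w / c in the plasma."""
    return n_plasma(w, w_p) * w / C


def pair_velocities(w1, w2, w_p):
    """Phase velocity (7.13) and group velocity (7.14) for two plane waves."""
    k1, k2 = k_plasma(w1, w_p), k_plasma(w2, w_p)
    v_p = (w1 + w2) / (k1 + k2)
    v_g = (w2 - w1) / (k2 - k1)
    return v_p, v_g


if __name__ == "__main__":
    w_p = 1.0e15                    # assumed plasma frequency, rad/s
    w1, w2 = 2.0e15, 2.001e15       # assumed closely spaced frequencies, rad/s
    v_p, v_g = pair_velocities(w1, w2, w_p)
    n_mid = n_plasma(0.5 * (w1 + w2), w_p)
    print("v_p / c =", v_p / C)
    print("v_g / c =", v_g / C, " n_plasma =", n_mid)
    print("v_p * v_g / c^2 =", v_p * v_g / C ** 2)
```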

Example 7.1 illustrates that in an environment where the index of refraction is
real (i.e. no net exchange of energy with the medium), the group velocity does not
exceed c, even when the phase velocity does. The 'fast-moving' phase velocity v_p
results merely from an interplay between the field and the plasma. In a similar
sense, the point where an ocean wave meets the shoreline can also move faster
than c, if different points on the wave front happen to strike the shore nearly
simultaneously. The point of intersection between the wave and the shoreline does
not constitute an actual object under motion. Similarly, wave crests of individual
plane waves do not necessarily constitute actual objects that are moving. In short,
v_p is not the relevant speed at which events upstream influence events
downstream.

Individual plane waves have infinite length and infinite duration. They do not
exist in isolation except in our imagination. All real waveforms are comprised of a
range of frequency components, and so interference always happens. Energy is
associated with regions of constructive interference between those waves. Group
velocity v_g tracks the presence of field energy, whether that energy propagates or
is extracted from a medium. Although sometimes v_g can exceed c (i.e. when
absorption or amplification is involved), energy is never transported faster than
the universal speed limit c. An examination of energy flow is given in Appendix 7.B.



7.3 Frequency Spectrum of Light 

A waveform constructed from a discrete sum (as in the previous two sections) 
must eventually repeat over and over (i.e. it is periodic). To create a waveform that
does not repeat (e.g. a single laser pulse or, technically speaking, any waveform 
that exists in the physical world since no light source repeats forever) we must 
replace the discrete sum (7.1) with an integral that combines a continuum of 
plane waves. Such a waveform at a point r can be expressed as 

$$ \mathbf{E}(\mathbf{r},t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}(\mathbf{r},\omega)\, e^{-i\omega t}\, d\omega \tag{7.18} $$

The function E(r, ω), called the spectrum, has units of field per frequency.
Essentially, it gives the amplitude and phase of each plane wave that makes up the
overall waveform. It includes any spatially dependent factors such as exp{ik(ω)·r}.
We distinguish the spectrum E(r, ω) from the wholly separate function E(r, t) by
its argument (i.e. ω instead of t). (Sorry for using E for both functions, but this
is standard notation.) The operation (7.18) is called an inverse Fourier transform
as outlined in section 0.4; actually, it would be a good idea to review section 0.4
thoroughly. Now. Why haven't you turned to section 0.4 yet? The factor 1/√(2π) is
introduced to match our Fourier-transform convention. Regardless of what the
function is called, please notice that (7.18) merely sums together a range of plane
waves in much the same way that our earlier discrete summation (7.1) does.

If we already have/know a waveform E(r, t), one might wonder what plane 
waves should be added together in order to construct it. Equation (7.18) can be 
inverted, which remarkably has a very similar form: 

$$ \mathbf{E}(\mathbf{r},\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}(\mathbf{r},t)\, e^{i\omega t}\, dt \tag{7.19} $$

This operation is called the Fourier transform. It is used to generate the spectrum
E(r, ω) from the field E(r, t) in much the same way that (7.18) is used to generate
the field E(r, t) from the spectrum E(r, ω).

Although only the real part of E(r, t) is physically relevant, we can continue
our habit of working with the complex field and taking the real part of E(r, t) at
our leisure.¹ In fact, we will find it advantageous to work with the complex field
instead of only the real part. We will not run into trouble as long as we remember
never to discard the imaginary part of E(r, ω), only the imaginary part of E(r, t).

The intensity formula (7.7) remains useful for continuous superpositions of 
plane waves (i.e. a field defined by the inverse Fourier transform (7.18)): 

$$ I(\mathbf{r},t) \equiv \frac{n\epsilon_0 c}{2}\, \mathbf{E}(\mathbf{r},t)\cdot\mathbf{E}^*(\mathbf{r},t) \tag{7.20} $$

Remember, this formula specifically requires the fields to be in complex
format, and it takes care of the time-average over rapid oscillations automatically.²
Moreover, the above expression for I(r, t) assumes that all relevant k-vectors are
essentially parallel.

Similarly we will define the power spectrum produced from E(r, ω), which we
write as

$$ I(\mathbf{r},\omega) \equiv \frac{n\epsilon_0 c}{2}\, \mathbf{E}(\mathbf{r},\omega)\cdot\mathbf{E}^*(\mathbf{r},\omega) \tag{7.21} $$

The power spectrum I(r, ω) is what one observes when the waveform is sent into
a spectral analyzer or spectrometer. We must apologize again for the potentially
confusing notation (in wide usage): I(r, ω) is not the Fourier transform of I(r, t)!
They are defined exclusively through (7.20) and (7.21).

¹ Since Fourier transforms are linear, one can take the Fourier transform of the real and imaginary
parts of a field separately. Appropriate modifications to E(r, ω) in the frequency domain will not
cause the two parts to become mingled. Upon taking the inverse Fourier transform to obtain E(r, t)
again, the original real part remains purely real, and the original imaginary part remains purely
imaginary.

² To use this expression there needs to be a sufficient number of oscillations within the waveform
to make the rapid time average meaningful.

Figure 7.3 Real part of electric field (7.23) with T = 4π/ω_0 and
T = 10π/ω_0, where 2π/ω_0 is the period of the carrier frequency.

Figure 7.4 The intensity (7.20) of the fields in Fig. 7.3.

Parseval's theorem (see Example 0.7) imposes an interesting connection be- 
tween the time-integral of the intensity and the frequency-integral of the power 
spectrum: 



$$ \int_{-\infty}^{\infty} I(\mathbf{r},t)\, dt = \int_{-\infty}^{\infty} I(\mathbf{r},\omega)\, d\omega \tag{7.22} $$



With the above formalities out of the way, we will illustrate the use of Fourier 
transforms through some examples. 

Example 7.2 

Find E(r, ω) associated with the field

$$ \mathbf{E}(\mathbf{r},t) = \mathbf{E}_0(\mathbf{r})\, e^{-t^2/2T^2}\, e^{-i\omega_0 t} \tag{7.23} $$

The real part of this field is shown in Fig. 7.3 for two different durations T. The
intensity profile computed by (7.20) is shown in Fig. 7.4.

Solution: The argument r is unimportant to our calculation. It merely specifies 
that we are considering the field at the point r. We compute the Fourier transform 
as follows: 



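$$ \begin{aligned} \mathbf{E}(\mathbf{r},\omega) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}_0(\mathbf{r})\, e^{-t^2/2T^2}\, e^{-i\omega_0 t}\, e^{i\omega t}\, dt \\ &= \frac{\mathbf{E}_0(\mathbf{r})}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^2/2T^2 + i(\omega-\omega_0)t}\, dt \end{aligned} \tag{7.24} $$

This integral can be performed with the help of (0.55), and we obtain

$$ \mathbf{E}(\mathbf{r},\omega) = T\,\mathbf{E}_0(\mathbf{r})\, e^{-T^2(\omega-\omega_0)^2/2} \tag{7.25} $$

Notice that E(r, ω) has units of field multiplied by time, or in other words, field per
frequency.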
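
The closed form (7.25) can be compared against a direct numerical evaluation of
the Fourier integral (7.19). The following sketch uses a simple midpoint rule;
the pulse duration, carrier frequency, and integration window are assumed
illustrative values in arbitrary units, the amplitude E_0 is taken to be a real
scalar, and the function names are hypothetical.

```python
import cmath
import math


def gaussian_pulse(t, T, w0, E0=1.0):
    """Complex field (7.23) at a fixed point r, with scalar amplitude E0."""
    return E0 * math.exp(-t * t / (2.0 * T * T)) * cmath.exp(-1j * w0 * t)


def fourier_transform(f, w, t_min, t_max, steps=4000):
    """Midpoint-rule estimate of (7.19): (1/sqrt(2 pi)) * integral f(t) e^{i w t} dt."""
    dt = (t_max - t_min) / steps
    total = 0j
    for s in range(steps):
        t = t_min + (s + 0.5) * dt
        total += f(t) * cmath.exp(1j * w * t)
    return total * dt / math.sqrt(2.0 * math.pi)


def analytic_spectrum(w, T, w0, E0=1.0):
    """Closed form (7.25)."""
    return T * E0 * math.exp(-(T * (w - w0)) ** 2 / 2.0)


if __name__ == "__main__":
    T, w0 = 2.0, 10.0   # assumed duration and carrier frequency
    for w in (9.0, 10.0, 10.5):
        numeric = fourier_transform(lambda t: gaussian_pulse(t, T, w0), w, -20.0, 20.0)
        print(w, numeric, analytic_spectrum(w, T, w0))
```

The integration window of ±10T is chosen so that the Gaussian tails left out of
the integral are negligible.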



In general, E(r, ω) is a complex function. E(r, ω) keeps track of the amplitude
and phase of each plane wave needed to compose the waveform E(r, t). More
often than not, E(r, ω) exhibits a complicated complex phase structure, depending
on the time-shape of E(r, t).

The spectrum of the field in Example 7.2 is shown in Fig. 7.5. The complex
phase turns out to be boringly uniform for this example; if E_0 is real, the imaginary
part of the spectrum turns out to be zero for all frequencies. The corresponding
power spectrum (7.21) is plotted in Fig. 7.6. As expected, the waveform includes
frequencies in the neighborhood of ω_0. A range of frequencies is needed to
construct waveforms that turn on and off. The shorter the duration of the waveform,
the more frequency components are necessary. This trend can be seen for
the two pulse durations T plotted.

Example 7.3 

Check Parseval's theorem for the field and spectrum in Example 7.2. 
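Solution: The time integration in (7.22) yields

$$ \int_{-\infty}^{\infty} I(\mathbf{r},t)\, dt = \frac{n\epsilon_0 c}{2}\, \mathbf{E}_0(\mathbf{r})\cdot\mathbf{E}_0^*(\mathbf{r}) \int_{-\infty}^{\infty} e^{-t^2/T^2}\, dt = \frac{n\epsilon_0 c}{2}\, \mathbf{E}_0(\mathbf{r})\cdot\mathbf{E}_0^*(\mathbf{r})\, T\sqrt{\pi} $$

where we have used (0.55) to perform the integration. This result has units of
energy per area. It is the energy per area absorbed by a detector after the pulse has
concluded. The frequency integration in (7.22) yields

$$ \int_{-\infty}^{\infty} I(\mathbf{r},\omega)\, d\omega = \frac{n\epsilon_0 c}{2}\, \mathbf{E}_0(\mathbf{r})\cdot\mathbf{E}_0^*(\mathbf{r})\, T^2 \int_{-\infty}^{\infty} e^{-T^2(\omega-\omega_0)^2}\, d\omega = \frac{n\epsilon_0 c}{2}\, \mathbf{E}_0(\mathbf{r})\cdot\mathbf{E}_0^*(\mathbf{r})\, T\sqrt{\pi} $$

which is the same answer.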
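
Parseval's theorem (7.22) for the Gaussian pulse of Example 7.2 can also be
checked by direct numerical integration. In the sketch below the common
prefactor n ε_0 c E_0·E_0*/2 is set to one, the duration and carrier frequency
are assumed illustrative values, and the function names are hypothetical. Both
integrals are expected to approach T√π.

```python
import math


def midpoint_integral(f, a, b, steps=20000):
    """Midpoint-rule estimate of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (s + 0.5) * h) for s in range(steps))


def time_integral(T):
    """Integral of I(t) ~ exp(-t^2 / T^2), from (7.20) applied to (7.23)."""
    return midpoint_integral(lambda t: math.exp(-(t / T) ** 2), -12.0 * T, 12.0 * T)


def frequency_integral(T, w0):
    """Integral of I(w) ~ T^2 exp(-T^2 (w - w0)^2), from (7.21) applied to (7.25)."""
    return midpoint_integral(
        lambda w: T * T * math.exp(-(T * (w - w0)) ** 2),
        w0 - 12.0 / T,
        w0 + 12.0 / T,
    )


if __name__ == "__main__":
    T, w0 = 2.0, 10.0   # assumed duration and carrier frequency
    print(time_integral(T), frequency_integral(T, w0), T * math.sqrt(math.pi))
```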
Figure 7.5 Spectral components (7.25) of the fields in Fig. 7.3 with
T = 4π/ω_0 and T = 10π/ω_0, where 2π/ω_0 is the period of the carrier
frequency. Each panel shows Re{E(ω)} and Im{E(ω)} as functions of ω.



As mentioned previously, the inverse Fourier transform is interpreted as sum- 
ming together many plane waves to create a waveform. 

Example 7.4 

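Take the inverse Fourier transform of (7.25) to recover the original waveform (7.23).

Solution: The inverse Fourier transform (7.18) is

$$ \begin{aligned} \mathbf{E}(\mathbf{r},t) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}(\mathbf{r},\omega)\, e^{-i\omega t}\, d\omega \\ &= \frac{T\,\mathbf{E}_0(\mathbf{r})}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-T^2(\omega-\omega_0)^2/2}\, e^{-i\omega t}\, d\omega \\ &= \frac{T\,\mathbf{E}_0(\mathbf{r})}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{T^2}{2}\omega^2 + (T^2\omega_0 - it)\,\omega - \frac{T^2}{2}\omega_0^2}\, d\omega \end{aligned} \tag{7.26} $$

This integral can be performed with the help of (0.55), which gives

$$ \mathbf{E}(\mathbf{r},t) = \frac{T\,\mathbf{E}_0(\mathbf{r})}{\sqrt{2\pi}} \sqrt{\frac{\pi}{T^2/2}}\; e^{\frac{(T^2\omega_0 - it)^2}{4(T^2/2)} - \frac{T^2}{2}\omega_0^2} = \mathbf{E}_0(\mathbf{r})\, e^{-t^2/2T^2}\, e^{-i\omega_0 t} $$

Figure 7.6 Power spectrum based on (7.21) for the spectral components
shown in Fig. 7.5.

Figure 7.7 Spectrum based on (7.28) with T = 10π/ω_0, showing Re{E(ω)} and
Im{E(ω)} as functions of ω. Compare with the lower curve in Fig. 7.5.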



Since only the real part of the time profile E(r, t) is physically relevant, you 
might be curious about how the Fourier transform of the real part of the field 
compares with that of the complex version of the field that we have been using. 
Indeed, there are situations where it is more appropriate to use the real version 
of the field rather than its complex form. For example, if a waveform includes 
multiple propagation directions or if a waveform contains only a few cycles, then 
the motivation/interpretation behind (7.20) and the convenience of the complex
format begin to wane.

Example 7.5 

Take the Fourier transform of just the real part of waveform (7.23). 

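Solution: The real part of (7.23) is

$$ \mathbf{E}_r(\mathbf{r},t) = \frac{\mathbf{E}(\mathbf{r},t) + \mathbf{E}^*(\mathbf{r},t)}{2} = e^{-t^2/2T^2}\, \frac{\mathbf{E}_0(\mathbf{r})\, e^{-i\omega_0 t} + \mathbf{E}_0^*(\mathbf{r})\, e^{i\omega_0 t}}{2} \tag{7.27} $$

If E_0(r) is real, then this field can be written as E_0(r) e^{−t²/2T²} cos(ω_0 t). The Fourier
transform (7.19) yields (see P0.24)

$$ \mathbf{E}_r(\mathbf{r},\omega) = T\, \frac{\mathbf{E}_0(\mathbf{r})\, e^{-T^2(\omega-\omega_0)^2/2} + \mathbf{E}_0^*(\mathbf{r})\, e^{-T^2(\omega+\omega_0)^2/2}}{2} \tag{7.28} $$

The spectrum is shown in Fig. 7.7.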



From the above example, you might notice that the transform of the real 
part of a field tends to be more cumbersome than the transform of the entire 
complex field. For the real field, both positive and negative frequency components 
contribute to the overall spectrum.³ Moreover, the Fourier transform of a real
function E r (r, t) obeys the following symmetry relation: 



$$ \mathbf{E}_r(\mathbf{r},-\omega) = \mathbf{E}_r^*(\mathbf{r},\omega) \qquad (\text{if } \mathbf{E}_r(\mathbf{r},t) \text{ is real}) \tag{7.29} $$



The spectrum in Fig. 7.7 obeys this symmetry relation, whereas the Fourier trans- 
form of the complex field depicted in Fig. 7.5 does not. 
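
The symmetry relation (7.29) and the spectrum (7.28) can be spot-checked
numerically. The sketch below Fourier transforms the real field (7.27) with a
midpoint rule and compares the result at +ω and −ω. A complex scalar amplitude
E_0 and the other parameter values are assumptions chosen only for
illustration; the function names are hypothetical.

```python
import cmath
import math


def real_pulse(t, T, w0, E0):
    """Real field (7.27): Re{E0 exp(-i w0 t)} exp(-t^2 / 2T^2)."""
    return (E0 * cmath.exp(-1j * w0 * t)).real * math.exp(-t * t / (2.0 * T * T))


def fourier_transform(f, w, t_min, t_max, steps=4000):
    """Midpoint-rule estimate of (7.19): (1/sqrt(2 pi)) * integral f(t) e^{i w t} dt."""
    dt = (t_max - t_min) / steps
    total = 0j
    for s in range(steps):
        t = t_min + (s + 0.5) * dt
        total += f(t) * cmath.exp(1j * w * t)
    return total * dt / math.sqrt(2.0 * math.pi)


def spectrum_real_field(w, T, w0, E0):
    """Closed form (7.28) for the spectrum of the real field."""
    plus = E0 * math.exp(-(T * (w - w0)) ** 2 / 2.0)
    minus = E0.conjugate() * math.exp(-(T * (w + w0)) ** 2 / 2.0)
    return 0.5 * T * (plus + minus)


if __name__ == "__main__":
    T, w0, E0 = 2.0, 3.0, 1.0 + 0.5j   # assumed values, arbitrary units
    w = 2.6
    pos = fourier_transform(lambda t: real_pulse(t, T, w0, E0), w, -20.0, 20.0)
    neg = fourier_transform(lambda t: real_pulse(t, T, w0, E0), -w, -20.0, 20.0)
    print("E_r(+w) =", pos)
    print("E_r(-w) =", neg, " conj of E_r(+w) =", pos.conjugate())
    print("closed form (7.28) =", spectrum_real_field(w, T, w0, E0))
```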



³ Essentially, the spectrum of the complex representation of the field can be understood to be
twice the spectrum of the real representation, but plotted only for the positive frequencies. 
7.4 Packet Propagation and Group Delay

Once we have the spectrum for a waveform (obtained by Fourier transform),
we can apply effects to the individual spectral components. In particular, we
can find how an overall waveform propagates in a uniform medium by taking
advantage of our knowledge of how individual plane waves propagate (as studied
in chapter 2). At any point in the medium, we can perform an inverse Fourier
transform, which recombines spectral components (i.e. plane waves) to reveal
how the overall waveform looks as a function of time. Thus, we will be able to
predict the temporal profile of a waveform at any location given knowledge of
that waveform at another location.⁴

Let E(r_0, t) be the temporal profile of a pulse at some point r_0 in a medium.
The spectrum of this pulse E(r_0, ω) (found using (7.19)) gives the amplitudes and
phases of the individual plane wave components at the point r_0. We already
know how to propagate individual plane waves through a material (see (2.20)). A
phase shift associated with a displacement Δr modifies the spectral components
according to

$$ \mathbf{E}(\mathbf{r}_0+\Delta\mathbf{r},\omega) = \mathbf{E}(\mathbf{r}_0,\omega)\, e^{i\mathbf{k}(\omega)\cdot\Delta\mathbf{r}} \tag{7.30} $$

The k-vector contains the frequency-dependent information about the material
via k = n(ω)ω/c. (A complex wave vector k may also be used if absorption or
amplification is present.) We take the inverse Fourier transform of E(r_0 + Δr, ω) at
the new position to determine the waveform E(r_0 + Δr, t):

$$ \begin{aligned} \mathbf{E}(\mathbf{r}_0+\Delta\mathbf{r},t) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}(\mathbf{r}_0+\Delta\mathbf{r},\omega)\, e^{-i\omega t}\, d\omega \\ &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}(\mathbf{r}_0,\omega)\, e^{i(\mathbf{k}(\omega)\cdot\Delta\mathbf{r}-\omega t)}\, d\omega \end{aligned} \tag{7.31} $$


Example 7.6

If a waveform at r = 0 has the form E(0, t) = E_0 e^{−t²/2T²} e^{−iω_0 t},
compute the waveform at r = z ẑ if propagation occurs in vacuum in the
z-direction.

Solution: Of course, after traversing Δr = z ẑ in vacuum, the waveform will
look the same, only arriving a time z/c later. We'll demonstrate that the
tools described above yield this expected result. The Fourier transform of
the Gaussian pulse is given in (7.25):

$$ \mathbf{E}(0,\omega) = T\,\mathbf{E}_0\, e^{-T^2(\omega-\omega_0)^2/2} $$

To find the field downstream we invoke (7.30), assuming k(ω) = k_vac(ω) ẑ =
(ω/c) ẑ, which gives the appropriate phase shift for each plane wave
component:

$$ \mathbf{E}(z,\omega) = \mathbf{E}(0,\omega)\, e^{i\mathbf{k}(\omega)\cdot\Delta\mathbf{r}} = T\,\mathbf{E}_0\, e^{-T^2(\omega-\omega_0)^2/2}\, e^{i\frac{\omega}{c}z} $$

We compute the final waveform using (7.31) and obtain

$$ \mathbf{E}(z,t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}_0\, T\, e^{-T^2(\omega-\omega_0)^2/2}\, e^{i\left(\frac{\omega}{c}z - \omega t\right)}\, d\omega = \mathbf{E}_0\, e^{-(t-z/c)^2/2T^2}\, e^{-i\omega_0 (t - z/c)} \tag{7.32} $$

Not surprisingly, after traveling a distance z through vacuum, the pulse looks
identical to the original pulse, only delayed by time z/c.

⁴ See J. D. Jackson, Classical Electrodynamics, 3rd ed., Sect. 7.8 (New York: John Wiley, 1999).

A waveform propagating in a material such as glass can undergo significant 
temporal dispersion, as different frequency components experience different 
indices of refraction. Each frequency component propagates at its own phase 
velocity. The speed of the pulse, however, can be quite different; it propagates 
approximately with the group velocity, as will be shown below. 

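The exponent in (7.30) is called the phase delay for the pulse propagation. It
is often expanded in a Taylor series about the pulse carrier frequency ω_0:

$$ \mathbf{k}\cdot\Delta\mathbf{r} = \left[ \left.\mathbf{k}\right|_{\omega_0} + \left.\frac{\partial\mathbf{k}}{\partial\omega}\right|_{\omega_0} (\omega-\omega_0) + \frac{1}{2} \left.\frac{\partial^2\mathbf{k}}{\partial\omega^2}\right|_{\omega_0} (\omega-\omega_0)^2 + \cdots \right] \cdot \Delta\mathbf{r} \tag{7.33} $$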



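The k-vector has a sometimes-complicated frequency dependence through the
functional form of n(ω). If we retain only the first two terms in this expansion
then (7.31) becomes

$$ \begin{aligned} \mathbf{E}(\mathbf{r}_0+\Delta\mathbf{r},t) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}(\mathbf{r}_0,\omega)\, e^{i\left( \left[ \mathbf{k}(\omega_0) + \left.\frac{\partial\mathbf{k}}{\partial\omega}\right|_{\omega_0}(\omega-\omega_0) \right]\cdot\Delta\mathbf{r} - \omega t \right)}\, d\omega \\ &= e^{i\left[ \mathbf{k}(\omega_0) - \omega_0 \left.\frac{\partial\mathbf{k}}{\partial\omega}\right|_{\omega_0} \right]\cdot\Delta\mathbf{r}}\; \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}(\mathbf{r}_0,\omega)\, e^{-i\omega\left( t - \left.\frac{\partial\mathbf{k}}{\partial\omega}\right|_{\omega_0}\cdot\Delta\mathbf{r} \right)}\, d\omega \\ &= e^{i\left[ \mathbf{k}(\omega_0)\cdot\Delta\mathbf{r} - \omega_0 t' \right]}\; \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \mathbf{E}(\mathbf{r}_0,\omega)\, e^{-i\omega(t-t')}\, d\omega \end{aligned} \tag{7.34} $$

where in the last line we have introduced the definition

$$ t' \equiv \left.\frac{\partial\mathbf{k}}{\partial\omega}\right|_{\omega_0} \cdot \Delta\mathbf{r} = \left.\frac{\partial\,\mathrm{Re}\{\mathbf{k}\}}{\partial\omega}\right|_{\omega_0} \cdot \Delta\mathbf{r} \tag{7.35} $$

and assumed that the imaginary part of k is roughly constant near ω_0 so that t'
is real. Then the integral in (7.34) is recognized as the inverse Fourier transform
(7.18), which returns the original pulse with a new time argument:

$$ \mathbf{E}(\mathbf{r}_0+\Delta\mathbf{r},t) = \mathbf{E}(\mathbf{r}_0, t-t')\, e^{i\left[ \mathbf{k}(\omega_0)\cdot\Delta\mathbf{r} - \omega_0 t' \right]} \tag{7.36} $$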
Notice that (7.32) from Example 7.6 agrees with this result, since k_vac(ω_0) · Δr =
ω_0 z/c. The second factor in (7.36) merely gives an overall phase shift due to
propagation. The phase shift is dictated by the phase velocity of the carrier frequency
(see (7.9)):

$$ v_p(\omega_0) = \frac{\omega_0}{k(\omega_0)} \tag{7.37} $$

Otherwise (7.36) is unaltered except for a delay t', the time required for the pulse
to traverse the displacement Δr.

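The function $\partial\,\mathrm{Re}\,\mathbf{k}/\partial\omega\cdot\Delta\mathbf{r}$ is known as the group delay function, and in (7.35) 
it is evaluated only at the carrier frequency $\omega_0$. Traditional group velocity is 
obtained by dividing the displacement $\Delta r$ by the group delay time $t'$ to obtain 

$$v_g^{-1}(\omega_0) = \left.\frac{\partial\,\mathrm{Re}\{k(\omega)\}}{\partial\omega}\right|_{\omega_0} \tag{7.38}$$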



Group delay (or group velocity) essentially tracks the center of the packet. 

In our derivation we have assumed that the phase delay $\mathbf{k}(\omega)\cdot\Delta\mathbf{r}$ could be 
well-represented by the first two terms of the expansion (7.33). While this assumption 
gives results that are often useful, higher-order terms can also play a role. In 
section 7.5 we'll find that the next term in the expansion controls the rate at which 
the wave packet spreads as it travels. We should also note that there are times 
when the expansion (7.33) fails to converge (when $\omega_0$ is near a resonance of the 
medium), and the above expansion approach is not valid. We'll analyze pulse 
propagation in these sticky situations in section 7.6. 



7.5 Quadratic Dispersion 

A light pulse traversing a material in general undergoes dispersion when different 
frequency components propagate with different phase velocities. As an example, 
consider a short laser pulse traversing an optical component such as a lens or 
window, as depicted in Fig. 7.8. The short light pulse can broaden in time⁵ with 
the different frequency components becoming separated (often called stretching 
or chirping). If absorption (and surface reflections) can be neglected, then the 
amplitude of $\mathbf{E}(\mathbf{r},\omega)$ does not change (only its phase changes), and the power 
spectrum (7.21) remains unaltered. 

Consider light traveling in a material such as glass. Let all plane-wave 
components travel in the $z$-direction. We place $\mathbf{r}_0$ at the start of the glass where we 
assign $z = 0$, so that $\mathbf{k}\cdot\Delta\mathbf{r} = kz$. Let the polarization of the field be the same for all 
frequencies. 

5 See J. D. Jackson, Classical Electrodynamics, 3rd ed., Sect. 7.9 (New York: John Wiley, 1999). 

To find the waveform at the new position $z$ (where the pulse presumably 
has just exited the glass), we take the inverse Fourier transform (7.31). However, 
before doing this we must specify the function $k(\omega)$. In general, with the exact 
functional form for the index the inverse Fourier transform can only be performed 
numerically. For our present purposes, we again resort to an expansion of the 
type (7.33), but this time we will keep one additional term: 



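$$k(\omega)\,z \cong k_0 z + v_g^{-1}(\omega-\omega_0)\,z + \alpha(\omega-\omega_0)^2 z + \cdots \tag{7.39}$$

where 

$$k_0 \equiv k(\omega_0) = \frac{\omega_0\, n(\omega_0)}{c} \tag{7.40}$$

$$v_g^{-1} \equiv \left.\frac{\partial k}{\partial\omega}\right|_{\omega_0} = \frac{n(\omega_0)}{c} + \frac{\omega_0\, n'(\omega_0)}{c} \tag{7.41}$$

$$\alpha \equiv \frac{1}{2}\left.\frac{\partial^2 k}{\partial\omega^2}\right|_{\omega_0} = \frac{n'(\omega_0)}{c} + \frac{\omega_0\, n''(\omega_0)}{2c} \tag{7.42}$$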



Figure 7.8 A 25 fs pulse traversing 
a 1 cm piece of BK7 glass. (Figure labels: 25 fs, 56 fs.) 



As before, we have supposed that the imaginary part of the index is negligible. 
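
The coefficients (7.40)-(7.42) can be estimated numerically once an index model is chosen. The sketch below is only an illustration: it assumes a three-term Sellmeier fit for BK7 (the coefficients are not given in the text), an 800 nm carrier wavelength, and simple central differences for the derivatives. The function names are hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)

# Assumed Sellmeier coefficients for BK7 (the C terms are in square micrometers).
# They are not given in the text.
SELLMEIER_B = (1.03961212, 0.231792344, 1.01046945)
SELLMEIER_C = (0.00600069867, 0.0200179144, 103.560653)


def refractive_index(omega):
    """Real index n(omega) from the assumed Sellmeier fit."""
    wavelength_um = 2.0 * np.pi * C / omega * 1e6
    lam2 = wavelength_um**2
    n_squared = 1.0 + sum(b * lam2 / (lam2 - c)
                          for b, c in zip(SELLMEIER_B, SELLMEIER_C))
    return np.sqrt(n_squared)


def wavenumber(omega):
    """k(omega) = omega * n(omega) / c, neglecting absorption."""
    return omega * refractive_index(omega) / C


def expansion_coefficients(omega0, rel_step=1e-3):
    """Return k0, 1/v_g and alpha of (7.40)-(7.42) by central differences."""
    h = rel_step * omega0
    k_minus, k_mid, k_plus = (wavenumber(w)
                              for w in (omega0 - h, omega0, omega0 + h))
    inv_vg = (k_plus - k_minus) / (2.0 * h)
    alpha = 0.5 * (k_plus - 2.0 * k_mid + k_minus) / h**2
    return k_mid, inv_vg, alpha


omega0 = 2.0 * np.pi * C / 800e-9  # assumed carrier wavelength: 800 nm
k0, inv_vg, alpha = expansion_coefficients(omega0)

print(f"n(omega0)   = {k0 * C / omega0:.4f}")
print(f"group index = {C * inv_vg:.4f}")
print(f"alpha       = {alpha * 1e27:.2f} fs^2/mm")
```

The factor `1e27` converts $\alpha$ from s²/m to fs²/mm. Finite differences are used here only for brevity; differentiating the Sellmeier expression analytically would serve equally well.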

Unfortunately, we can't calculate a general formula for the effect of quadratic 
dispersion on an arbitrary initial pulse. However, we can get a general idea for 
how quadratic dispersion works by considering the specific example of a Gaussian 
pulse. 

Example 7.7 

A Gaussian waveform similar to that in Example 7.6 propagates through a piece of 
glass with thickness $\Delta r = z$. Compute the waveform exiting the glass. 

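Solution: Again, the Fourier transform of the Gaussian pulse before propagation 
is given by (7.25): 

$$\mathbf{E}(0,\omega) = T\,\mathbf{E}_0\, e^{-T^2(\omega-\omega_0)^2/2}$$

With the aid of expansion (7.39), the inverse Fourier transform (7.31) (which yields 
the pulse after propagation) becomes 

$$\begin{aligned}
\mathbf{E}(z,t) &= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\mathbf{E}_0 T\, e^{-T^2(\omega-\omega_0)^2/2}\, e^{i k_0 z + i v_g^{-1}(\omega-\omega_0) z + i\alpha(\omega-\omega_0)^2 z}\, e^{-i\omega t}\,d\omega \\
&= \frac{T\,\mathbf{E}_0\, e^{i(k_0 z-\omega_0 t)}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\left(T^2/2-i\alpha z\right)(\omega-\omega_0)^2}\, e^{i v_g^{-1}(\omega-\omega_0) z - i(\omega-\omega_0)t}\,d\omega
\end{aligned} \tag{7.43}$$

We can avoid considerable clutter if we change variables to $\omega' \equiv \omega-\omega_0$. Then the 
inverse Fourier transform becomes 

$$\mathbf{E}(z,t) = \frac{T\,\mathbf{E}_0\, e^{i(k_0 z-\omega_0 t)}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{T^2}{2}\left(1-i2\alpha z/T^2\right)\omega'^2 - i\left(t-z/v_g\right)\omega'}\,d\omega' \tag{7.44}$$

The above integral can be performed with the aid of (0.55). The result is 

$$\begin{aligned}
\mathbf{E}(z,t) &= \frac{T\,\mathbf{E}_0\, e^{i(k_0 z-\omega_0 t)}}{\sqrt{2\pi}}\sqrt{\frac{\pi}{\frac{T^2}{2}\left[1-i2\alpha z/T^2\right]}}\; e^{-\frac{(t-z/v_g)^2}{4\frac{T^2}{2}\left(1-i2\alpha z/T^2\right)}} \\
&= \mathbf{E}_0\, e^{i(k_0 z-\omega_0 t)}\,\frac{e^{\frac{i}{2}\tan^{-1}\left(2\alpha z/T^2\right)}}{\sqrt[4]{1+\left(2\alpha z/T^2\right)^2}}\; e^{-\frac{(t-z/v_g)^2}{2T^2\left[1+\left(2\alpha z/T^2\right)^2\right]}\left(1+i2\alpha z/T^2\right)}
\end{aligned} \tag{7.45}$$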

Next, we spruce up the appearance of this rather cumbersome formula as follows: 



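$$\mathbf{E}(z,t) = \frac{\mathbf{E}_0}{\sqrt{\mathcal{T}(z)/T}}\; e^{-\frac{1}{2}\left(\frac{t-z/v_g}{\mathcal{T}(z)}\right)^2}\, e^{-\frac{i}{2}\left(\frac{t-z/v_g}{\mathcal{T}(z)}\right)^2\Phi(z) + i(k_0 z-\omega_0 t) + \frac{i}{2}\tan^{-1}\Phi(z)} \tag{7.46}$$

where 

$$\Phi(z) \equiv \frac{2\alpha}{T^2}\,z \tag{7.47}$$

and 

$$\mathcal{T}(z) \equiv T\sqrt{1+\Phi^2(z)} \tag{7.48}$$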



We can immediately make a few observations about (7.46). First, note that 
at $z = 0$ (i.e. zero thickness of glass), (7.46) reduces to the input pulse 
$\mathbf{E}(0,t) = \mathbf{E}_0\, e^{-t^2/2T^2} e^{-i\omega_0 t}$, as it should. Secondly, the peak of the pulse moves at speed $v_g$ 
since the factor $e^{-(t-z/v_g)^2/2\mathcal{T}^2(z)}$ controls the pulse amplitude, while the other 
terms (multiplied by $i$) in the exponent of (7.46) merely alter the phase. Also 
note that the duration of the pulse increases and its peak intensity decreases as it 
travels, since $\mathcal{T}(z)$ increases with $z$. In P7.8 we will find that (7.46) also predicts 
that for large $z$, the field of the spread-out pulse oscillates less rapidly at the 
beginning of the pulse than at the end (assuming $\alpha > 0$). This phenomenon, known as 
pulse chirping, means that red frequencies get ahead of blue frequencies during 
propagation since the red frequencies experience a lower index of refraction. 

While Example 7.7 is worked out for the specific case of a Gaussian pulse, 
the results are qualitatively similar for all pulses. The exact details vary with 
pulse shape, but all short pulses eventually broaden and chirp as they propagate 
through a dispersive medium such as glass. Higher-order terms in the expansion 
(7.33) that were neglected cause additional spreading, chirping, and other 
deformations to the pulses as they propagate. The influence of each order becomes 
progressively more cumbersome to study analytically. In that case, it is easier to 
perform the inverse Fourier transform numerically; there is no need to resort to 
the expansion of $k(\omega)$ if the integration is done numerically. 
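
As a rough illustration of doing the inverse Fourier transform (7.31) numerically, the sketch below propagates the Gaussian spectrum (7.25) with the quadratic phase delay (7.39) and compares the resulting duration with (7.48). All parameter values are arbitrary assumptions in dimensionless units, and a plain Riemann sum stands in for the integral; any $k(\omega)$ could be substituted for the truncated expansion.

```python
import numpy as np

# Illustrative parameters in arbitrary units (assumptions, not from the text).
T = 1.0         # initial pulse duration
omega0 = 20.0   # carrier frequency
k0 = 30.0       # k(omega0)
inv_vg = 1.5    # 1/v_g
alpha = 0.25    # quadratic dispersion coefficient
z = 4.0         # propagation distance, so Phi = 2*alpha*z/T**2 = 2

omega = np.linspace(omega0 - 8.0 / T, omega0 + 8.0 / T, 801)
d_omega = omega[1] - omega[0]
t = np.linspace(-10.0, 22.0, 1601)

# Spectrum (7.25) with E0 = 1 and the truncated phase delay (7.39).
spectrum = T * np.exp(-T**2 * (omega - omega0)**2 / 2.0)
phase_delay = (k0 * z + inv_vg * (omega - omega0) * z
               + alpha * (omega - omega0)**2 * z)

# Inverse Fourier transform (7.31) as a plain Riemann sum over omega.
kernel = np.exp(1j * (phase_delay[None, :] - np.outer(t, omega)))
field = kernel @ spectrum * d_omega / np.sqrt(2.0 * np.pi)
intensity = np.abs(field)**2

# Moments of the intensity profile give the pulse center and duration.
norm = intensity.sum()
t_center = (t * intensity).sum() / norm
variance = ((t - t_center)**2 * intensity).sum() / norm
duration_numeric = np.sqrt(2.0 * variance)  # intensity ~ exp(-(t - t_c)^2 / duration^2)

phi = 2.0 * alpha * z / T**2
duration_formula = T * np.sqrt(1.0 + phi**2)  # (7.48)

print("pulse center:", t_center, "expected z/v_g:", z * inv_vg)
print("duration:", duration_numeric, "expected:", duration_formula)
```

The direct sum costs one matrix-vector product; for long time records an FFT would be the usual choice, at the price of matching the FFT sign and normalization to the transform convention used here.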

7.6 Generalized Context for Group Delay 

The expansion of $k(\omega)$ in (7.33) is inconvenient if the frequency content 
(bandwidth) of a waveform encompasses a substantial portion of a resonance structure. 
In this case, it becomes necessary to retain a large number of terms in (7.33) to 
describe accurately the phase delay $\mathbf{k}(\omega)\cdot\Delta\mathbf{r}$. Moreover, if the bandwidth of the 
waveform is wider than the spectral resonance of the medium, the series 
altogether fails to converge. These difficulties have led to the traditional viewpoint 
that group velocity loses meaning for broadband waveforms near a resonance. In 
this section, we study a broader context for group velocity (or rather its inverse, 
group delay $\partial k/\partial\omega$), which is always valid, even for broadband pulses where the 
expansion (7.33) utterly fails. The analysis avoids the expansion and so is not 
restricted to a narrowband context. 

Figure 7.9 Animation of a Gaussian-envelope pulse (electric field) undergoing 
dispersion during transit. 

Figure 7.10 Real and imaginary parts of the refractive index for an 
absorptive medium. 

Figure 7.11 Transit time defined as the difference between arrival time at two points. 
(Diagram: the initial pulse at a detector at $\mathbf{r}_0$ and the final pulse at a detector at 
$\mathbf{r}_0+\Delta\mathbf{r}$, after propagation through a distance $\Delta r$.) 

We are interested in the arrival time of a waveform (or pulse) to a point, say, 
where a detector is located. The definition of the arrival time of pulse energy 
need only involve the Poynting flux (or the intensity), since it alone is responsible 
for energy transport. To deal with arbitrary broadband pulses, the arrival time 
should avoid presupposing a specific pulse shape, since the pulse may evolve 
in complicated ways during propagation. For example, the pulse peak or the 
midpoint on the rising edge of a pulse are poor indicators of arrival time if the 
pulse contains multiple peaks or a long and non-uniform rise time. 

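For the reasons given, we use a time expectation integral (or time 
'center-of-mass') to describe the arrival time of a pulse: 

$$\langle t\rangle_{\mathbf{r}} \equiv \frac{\int_{-\infty}^{\infty} t\, I(\mathbf{r},t)\,dt}{\int_{-\infty}^{\infty} I(\mathbf{r},t)\,dt} \tag{7.49}$$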

For simplification, we have assumed that the light travels in a uniform direction 
by using intensity rather than the Poynting vector. 
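
A minimal numerical sketch of the expectation integral (7.49) follows. The double-peaked intensity profiles are hypothetical, chosen only to show that the time 'center-of-mass' need not coincide with the tallest peak; the function name and grid are likewise assumptions.

```python
import numpy as np


def arrival_time(t, intensity):
    """Time 'center-of-mass' (7.49) evaluated on a uniform time grid."""
    return np.sum(t * intensity) / np.sum(intensity)


t = np.linspace(-20.0, 30.0, 5001)

# Hypothetical double-peaked intensity profile seen by a first detector.
profile_initial = np.exp(-(t + 2.0)**2) + 0.5 * np.exp(-(t - 4.0)**2)
# The same profile seen 3 time units later by a second detector.
profile_final = np.exp(-(t - 1.0)**2) + 0.5 * np.exp(-(t - 7.0)**2)

t_initial = arrival_time(t, profile_initial)
t_final = arrival_time(t, profile_final)

print("arrival at first detector: ", t_initial)
print("arrival at second detector:", t_final)
print("difference:", t_final - t_initial)
```

Here the tallest peak of the first profile sits at $t = -2$, while the expectation value sits between the two peaks. Because the grid is uniform, the grid spacing cancels in the ratio and plain sums suffice.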

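Consider a pulse as it travels from point $\mathbf{r}_0$ to point $\mathbf{r} = \mathbf{r}_0 + \Delta\mathbf{r}$ in a 
homogeneous medium. The difference in arrival times at the two points is 

$$\Delta t \equiv \langle t\rangle_{\mathbf{r}} - \langle t\rangle_{\mathbf{r}_0} \tag{7.50}$$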

The pulse shape can evolve in complicated ways between the two points, 
spreading with different portions being absorbed (or amplified) during transit as 
depicted in Fig. 7.11. Nevertheless, (7.50) renders an unambiguous time interval 
between the passage of the pulse center at each point. 

This difference in arrival time can be shown to consist of two terms (see 
P7.11):⁶ 

$$\Delta t = \Delta t_G(\mathbf{r}) + \Delta t_R(\mathbf{r}_0) \tag{7.51}$$

The first term, called the net group delay, dominates if the field waveform is 
initially symmetric in time (e.g. an unchirped Gaussian). It amounts to a spectral 
average of the group delay function taken with respect to the spectral content of 
the pulse arriving at the final point $\mathbf{r} = \mathbf{r}_0 + \Delta\mathbf{r}$: 

$$\Delta t_G(\mathbf{r}) = \frac{\int_{-\infty}^{\infty} I(\mathbf{r},\omega)\left[\frac{\partial\,\mathrm{Re}\,\mathbf{k}}{\partial\omega}\cdot\Delta\mathbf{r}\right]d\omega}{\int_{-\infty}^{\infty} I(\mathbf{r},\omega)\,d\omega} \tag{7.52}$$

where $I(\mathbf{r},\omega)$ is given in (7.21). The two curves in Fig. 7.12 show $I(\mathbf{r}_0,\omega)$ (before 
propagation) and $I(\mathbf{r},\omega)$ (after propagation) for an initially Gaussian pulse. As 
seen in (7.52), the pulse travel time depends on the spectral shape of the pulse at 
the end of propagation. 

Note the close resemblance between the formulas (7.49) and (7.52). Both are 
expectation integrals. The former is executed as a 'center-of-mass' integral on 
time; the latter is executed in the frequency domain on $\partial\,\mathrm{Re}\,\mathbf{k}\cdot\Delta\mathbf{r}/\partial\omega$, the group 
delay function (7.38). The group delay at every frequency present in the pulse 
influences the result. If the pulse has a narrow bandwidth in the neighborhood 
of $\omega_0$, the integral reduces to $\partial\,\mathrm{Re}\,\mathbf{k}/\partial\omega|_{\omega_0}\cdot\Delta\mathbf{r}$, in agreement with (7.38) (see P7.9). 
The net group delay depends only on the spectral content of the pulse, 
independent of its temporal organization (i.e., the phase of $\mathbf{E}(\mathbf{r},\omega)$ has no influence). Only 
the real part of the k-vector plays a direct role in (7.52). 
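
The narrowband limit of (7.52) can be checked with a short numerical sketch. It assumes a lossless medium (so the final power spectrum equals the initial Gaussian one) and a made-up cubic phase delay in dimensionless units; neither assumption comes from the text. For this toy model the spectral average works out analytically to $A + 3C/(2T^2)$, which tends to the carrier-frequency group delay $A$ as the bandwidth shrinks.

```python
import numpy as np

# Hypothetical lossless phase delay Re(k)*dr as a cubic in
# delta = omega - omega0 (arbitrary units).
A, B, CUBIC = 5.0, 0.3, 0.2


def phase_delay(delta):
    return A * delta + B * delta**2 + CUBIC * delta**3


def net_group_delay(T, n_points=4001):
    """Spectral average (7.52) for a Gaussian power spectrum exp(-T^2 delta^2)."""
    delta = np.linspace(-8.0 / T, 8.0 / T, n_points)
    power_spectrum = np.exp(-T**2 * delta**2)
    # Group delay function: derivative of the phase delay with respect to omega.
    group_delay = np.gradient(phase_delay(delta), delta)
    return np.sum(power_spectrum * group_delay) / np.sum(power_spectrum)


narrowband = net_group_delay(T=10.0)  # long pulse, narrow spectrum
broadband = net_group_delay(T=0.5)    # short pulse, wide spectrum

print("narrowband:", narrowband, " analytic:", A + 3 * CUBIC / (2 * 10.0**2))
print("broadband: ", broadband, " analytic:", A + 3 * CUBIC / (2 * 0.5**2))
```

In an absorbing medium the weighting spectrum would instead be the attenuated one at the final point, as (7.52) prescribes.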

The second term in (7.51) is the reshaping delay $\Delta t_R$. It represents a delay 
that arises solely from a reshaping of the spectral amplitude. Often this term is 
negligible. The term takes into account how the pulse time center-of-mass shifts 
as portions of the spectrum are removed (or added), as illustrated in Fig. 7.13. It 
is computed at $\mathbf{r}_0$ before propagation takes place:⁷ 

$$\Delta t_R(\mathbf{r}_0) \equiv \langle t\rangle_{\mathbf{r}_0}\big|_{\rm altered} - \langle t\rangle_{\mathbf{r}_0} \tag{7.53}$$

Here $\langle t\rangle_{\mathbf{r}_0}$ represents the usual arrival time of the pulse at the initial point $\mathbf{r}_0$, 
according to (7.49). The intensity at this point is associated with a field $\mathbf{E}(\mathbf{r}_0,t)$ 
whose spectrum is $\mathbf{E}(\mathbf{r}_0,\omega)$. On the other hand, $\langle t\rangle_{\mathbf{r}_0}|_{\rm altered}$ is the arrival time of 
a pulse with modified spectrum $\mathbf{E}(\mathbf{r}_0,\omega)\,e^{-\mathrm{Im}\,\mathbf{k}\cdot\Delta\mathbf{r}}$. Notice that $\mathbf{E}(\mathbf{r}_0,\omega)\,e^{-\mathrm{Im}\,\mathbf{k}\cdot\Delta\mathbf{r}}$ is 
still evaluated at the initial point $\mathbf{r}_0$. Only the spectral amplitude (not the phase) 
is modified, according to what is anticipated to be lost (or gained) during the trip. 
In contrast to the net group delay, the reshaping delay is sensitive to how a pulse 
is organized. The reshaping delay is negligible if the pulse is initially symmetric 
(in amplitude and phase) before propagation. The reshaping delay also goes to 
zero in the narrowband limit, and the total delay reduces to the net group delay. 

6 M. Ware, S. A. Glasgow, and J. Peatross, "The Role of Group Velocity in Tracking Field Energy in 
Linear Dielectrics," Opt. Express 9, 506-518 (2001). 

7 The reshaping delay can instead be computed after propagation takes place, in which case the 
net group delay should be computed with the initial rather than final spectrum. 

Figure 7.12 Normalized power spectrum of a broadband pulse before and after 
propagation through an absorbing medium with the complex index shown in 
Fig. 7.10. The absorption line eats a hole in the spectrum. 

Figure 7.13 The center of a chirped pulse can shift owing to the reshaping 
effect when spectrum is removed. 

Figure 7.14 Animation comparing narrowband vs. broadband Gaussian pulses 
traversing an absorbing slab (green stripe) on resonance. Note the logarithmic 
scale. See Example 7.8. 

Example 7.8 

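Find the time required for a Gaussian pulse (7.23) to traverse a slab of absorbing 
material (neglecting possible surface reflections). Let the material response be 
described by the Lorentz model described in section 2.2 with the carrier frequency 
of the pulse, $\omega_0$, coinciding with the material resonance frequency. Let the slab 
have thickness $\Delta r = c\gamma^{-1}/10$ and absorption strength $\omega_p^2 = 10\gamma$. 

Solution: The spectrum of the initially Gaussian pulse is given by (7.25), and its 
power spectrum is⁸ 

$$I(\mathbf{r}_0,\omega) \propto e^{-T^2(\omega-\omega_0)^2}$$

After propagating from $\mathbf{r}_0$ to $\mathbf{r} = \mathbf{r}_0 + \Delta\mathbf{r}$, the power spectrum becomes, in accordance with (7.30), 

$$I(\mathbf{r},\omega) = I(\mathbf{r}_0,\omega)\,e^{-2\,\mathrm{Im}\,\mathbf{k}\cdot\Delta\mathbf{r}} \propto e^{-T^2(\omega-\omega_0)^2}\,e^{-2\kappa\omega\Delta r/c}$$

The net group delay is then 

$$\Delta t_G(\mathbf{r}) = \Delta r\,\frac{\int_{-\infty}^{\infty} I(\mathbf{r},\omega)\,\frac{\partial\,\mathrm{Re}\,k}{\partial\omega}\,d\omega}{\int_{-\infty}^{\infty} I(\mathbf{r},\omega)\,d\omega} = \frac{\Delta r}{c}\,\frac{\int_{-\infty}^{\infty} e^{-T^2(\omega-\omega_0)^2}\,e^{-2\kappa\omega\Delta r/c}\left[n+\omega\frac{\partial n}{\partial\omega}\right]d\omega}{\int_{-\infty}^{\infty} e^{-T^2(\omega-\omega_0)^2}\,e^{-2\kappa\omega\Delta r/c}\,d\omega}$$

The index of refraction $n + i\kappa$ is given by (2.39) (see also (2.27) and (2.29)). Since 
the expressions for $n$ and $\kappa$ are complicated, the integration in the above formula 
must be performed numerically. 

The result when $T = T_1 = 10\gamma^{-1}/\sqrt{2}$ (narrowband) is 

$$\Delta t_G = -5.1/\gamma = -51\,\Delta r/c = -0.72\,T_1$$

and the result when $T = T_2 = \gamma^{-1}/\sqrt{2}$ (broadband) is 

$$\Delta t_G = 0.67/\gamma = 6.7\,\Delta r/c = 0.95\,T_2$$

The reshaping delay (7.53) in both cases is negligible. 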

The narrowband pulse (with duration $T_1$) in Example 7.8 traverses the 
absorbing medium superluminally (i.e. faster than $c$). The negative transit time 
means that the 'center-of-mass' of the exiting pulse emerges even before the 
'center-of-mass' of the entering pulse reaches the medium! On the other hand, 
the broadband pulse (with the shorter duration $T_2$) has a large positive delay time, 
indicating that the exiting pulse emerges subluminally. 

8 In general, one should use a distinct symbol (such as $\bar{\omega}$) for the carrier frequency of the pulse 
to distinguish it from the resonance frequency of the material $\omega_0$; in practice, these are often different. 

Figure 7.14 shows the intensity profiles for these two pulses as they traverse 
the absorbing slab, calculated with the aid of (7.31). By eye, one can see how 
the centers of the two pulses are either advanced or delayed as they go through 
the absorbing medium. In both cases, the pulse that emerges is well within 
the envelope of the original pulse propagated forward at $c$. In the case of the 
broadband pulse, the absorption peak eats a hole in the center of the spectrum 
as shown in Fig. 7.12, causing the emerging pulse to be distorted in time. The 
analysis in this section predicts the center of pulses, whereas to see the shape of 
pulses one needs to calculate (7.31). 

The results for the two pulse durations in Example 7.8 indicate a trend. Su- 
perluminal behavior only occurs for long boring pulses. In the case of a single 
absorption resonance, this comes with a severe cost of attenuation. Figure 7.15 
shows the delay time as a function of pulse duration. As the injected pulse be- 
comes more sharply defined in time, the superluminal behavior does not persist. 
Sharply defined waveforms (i.e. broadband) cannot propagate superluminally 
precisely because much of their bandwidth lies away from the frequencies with 
superluminal group delays. 

We should mention that superluminal propagation cannot persist for indefi- 
nite distances since the medium eventually removes the superluminal spectral 
components through absorption (or else adds subluminal spectral components 
in the case of amplification) . This limits the amount that a pulse center can be 
advanced — on the scale of the pulse's own duration. 

As we saw for the absorption situation, the exiting pulse is tiny and resides 
well within the original envelope of the pulse propagated forward at speed $c$, 
as depicted in Fig. 7.16. Without the absorbing material in place, the signal 
would be detectable just as early. This statement is also true for any spectral 
behavior of a medium, including amplifying media. The Lorentz model (2.40) can 
describe an amplifying medium if the oscillator strength $f$ is taken to be negative. 
Figure 7.17 shows narrowband and broadband pulses traversing an amplifying 
medium. In this case, superluminal behavior occurs for spectra near but not 
on an amplifying resonance. If the pulse is too broadband, its spectrum will be 
amplified, which adds slower components to the overall group delay. 

Appendix 7.A Pulse Chirping in a Grating Pair 

Grating pairs can be used to introduce large amounts of dispersion into a light 
pulse. Gratings are especially useful for amplification of ultrashort laser pulses, 
where laser pulses are first stretched in time before amplification (to prevent 
damage to the amplifier) and then compressed back to short duration just before 
the experiment (called chirped pulse amplification). Diffraction from a grating 
causes each k-vector to travel at a different angle. A second grating parallel to the 
first can realign all of the k-vectors to be parallel to each other. Since laser beams 
are not infinitely wide, the light is typically sent through the grating pair twice 
to undo the tendency of the different frequency components to become laterally 
separated. In the present analysis, we will consider an infinitely wide plane-wave 
pulse incident upon a grating. The scenario is depicted in Fig. 7.19: a short 
plane-wave pulse strikes the grating at an angle, and a spreading pulse emerges. 

Figure 7.15 Delay as a function of pulse duration. 

Figure 7.16 Narrowband pulse traversing an absorbing medium. 

Figure 7.17 Animation comparing narrowband vs. broadband Gaussian pulses 
traversing an amplifying slab (green stripe) slightly off resonance. 

Figure 7.18 Direction of k-vector between parallel gratings (top view). Grating 
rulings run in and out of the page. (Figure labels: First Grating, Second Grating.) 

Consider a plane-wave pulse that ricochets between a pair of parallel grating 
surfaces. Although different k-vectors point with different angles, they are all 
straightened out upon diffracting from the second grating. For simplicity, we will 
consider a pulse just before the first bounce and just after the second bounce, 
even though we are interested in the dispersion that takes place between the 
gratings. Therefore, we can consider all k-vectors as being parallel with each 
other. 

Consider a plane wave incident on a grating at an incident angle $\theta_i$ with 
respect to the grating normal (aligned with the $x$-axis in our coordinate system) 
as depicted in Fig. 7.18. The plane wave diffracts from the first grating at an angle 
$\theta_r$ (also referenced from the grating normal). This angle is governed by the grating 
diffraction formula⁹ 

$$\theta_r(\omega) = \sin^{-1}\left(\frac{2\pi c}{\omega d} - \sin\theta_i\right) \tag{7.54}$$



where $d$ is the grating groove spacing. By examining the geometry of the figure, 
we see that the reflected k-vector is given by $\mathbf{k} = (\hat{\mathbf{x}}\cos\theta_r + \hat{\mathbf{y}}\sin\theta_r)\,\omega/c$. 

Suppose we know the pulse at a point $\mathbf{r}_0$ on the first grating. Next we choose a 
point $\mathbf{r}_0 + \Delta\mathbf{r}$ on the second grating where we will determine the outgoing pulse. 
Since we are considering an infinitely wide plane-wave pulse, it doesn't matter 
where we choose that point as long as it lies on the surface of the second grating. 
The waveform will be the same everywhere along the surface of the second grating; 
only its arrival time will trivially differ. For convenience, we might as well take the 
second point to be $\mathbf{r}_0 + \Delta\mathbf{r} = \mathbf{r}_0 + L\hat{\mathbf{x}}$ as shown in Fig. 7.18. 

The phase delay needed for (7.30) becomes 

$$\mathbf{k}(\omega)\cdot\Delta\mathbf{r} = \frac{\omega L}{c}\cos\theta_r \tag{7.55}$$



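We will express this as a Taylor-series expansion similar to (7.39) so that we can 
perform the inverse Fourier transform analytically. We will approximate (7.55) as 

$$\mathbf{k}(\omega)\cdot\Delta\mathbf{r} \cong k_0 L + v_g^{-1}(\omega-\omega_0)\,L + \alpha(\omega-\omega_0)^2 L + \cdots \tag{7.56}$$

so that we can take advantage of formula (7.46). To calculate the terms in this 
expansion we will need the derivative of $\theta_r$: 

$$\frac{d\theta_r}{d\omega} = \frac{1}{\sqrt{1-\left(\frac{2\pi c}{\omega d}-\sin\theta_i\right)^2}}\left(-\frac{2\pi c}{\omega^2 d}\right) = -\frac{2\pi c}{\omega^2 d\cos\theta_r} = -\frac{\sin\theta_i+\sin\theta_r}{\omega\cos\theta_r} \tag{7.57}$$

The derivatives necessary for the Taylor series expansion are 

$$\begin{aligned}
\frac{\partial\mathbf{k}}{\partial\omega}\cdot\Delta\mathbf{r} &= \frac{L}{c}\left[\cos\theta_r - \omega\sin\theta_r\frac{d\theta_r}{d\omega}\right] \\
&= \frac{L}{c}\left[\cos\theta_r + \sin\theta_r\,\frac{\sin\theta_i+\sin\theta_r}{\cos\theta_r}\right] \\
&= \frac{L}{c}\,\frac{1+\sin\theta_r\sin\theta_i}{\cos\theta_r}
\end{aligned} \tag{7.58}$$

and 

$$\begin{aligned}
\frac{\partial^2\mathbf{k}}{\partial\omega^2}\cdot\Delta\mathbf{r} &= \frac{L}{c}\left[\sin\theta_i + \frac{\sin\theta_r\left(1+\sin\theta_r\sin\theta_i\right)}{\cos^2\theta_r}\right]\frac{d\theta_r}{d\omega} \\
&= \frac{L}{c}\left[\frac{\sin\theta_i+\sin\theta_r}{\cos^2\theta_r}\right]\left[-\frac{\sin\theta_i+\sin\theta_r}{\omega\cos\theta_r}\right] \\
&= -\frac{L}{\omega c}\,\frac{\left(\sin\theta_i+\sin\theta_r\right)^2}{\cos^3\theta_r}
\end{aligned} \tag{7.59}$$

The coefficients in (7.56) then are 

$$k_0 = \frac{\omega_0}{c}\cos\theta_r(\omega_0) \tag{7.60}$$

$$v_g^{-1} = \left.\frac{\partial\mathbf{k}}{\partial\omega}\right|_{\omega_0}\cdot\frac{\Delta\mathbf{r}}{L} = \left.\frac{1+\sin\theta_r\sin\theta_i}{c\cos\theta_r}\right|_{\omega_0} \tag{7.61}$$

$$\alpha = \frac{1}{2}\left.\frac{\partial^2\mathbf{k}}{\partial\omega^2}\right|_{\omega_0}\cdot\frac{\Delta\mathbf{r}}{L} = -\left.\frac{\left(\sin\theta_i+\sin\theta_r\right)^2}{2c\,\omega\cos^3\theta_r}\right|_{\omega_0} \tag{7.62}$$

9 This formula is equivalent to $d\sin\theta_i + d\sin\theta_r = \lambda$ with $\lambda = 2\pi c/\omega$. 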



In the case of a Gaussian pulse, we can employ (7.46), where $L$ takes the place of 
$z$, and $k_0$, $v_g^{-1}$, and $\alpha$ are defined by (7.60) - (7.62). The duration of the pulse is 
controlled by (7.62) and the spacing between the gratings $L$. 
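
The grating-pair coefficients can be cross-checked against direct numerical differentiation of the phase delay (7.55). The sketch below assumes a 1200 lines/mm grating, an 800 nm carrier, a 30° incidence angle, and a 10 cm grating separation; none of these values come from the text, and the function names are hypothetical. The closed-form $\alpha$ is taken with the negative sign that follows from differentiating (7.55) twice.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)


def theta_r(omega, theta_i, d):
    """Diffraction angle from the grating formula (7.54)."""
    return np.arcsin(2.0 * np.pi * C / (omega * d) - np.sin(theta_i))


def phase_delay(omega, theta_i, d, L):
    """k(omega) . dr = (omega L / c) cos(theta_r), as in (7.55)."""
    return omega * L / C * np.cos(theta_r(omega, theta_i, d))


def grating_coefficients(omega0, theta_i, d):
    """Closed-form k0, 1/v_g and alpha, evaluated at omega0."""
    th = theta_r(omega0, theta_i, d)
    k0 = omega0 / C * np.cos(th)
    inv_vg = (1.0 + np.sin(th) * np.sin(theta_i)) / (C * np.cos(th))
    alpha = -(np.sin(theta_i) + np.sin(th))**2 / (2.0 * C * omega0 * np.cos(th)**3)
    return k0, inv_vg, alpha


# Assumed illustrative setup.
omega0 = 2.0 * np.pi * C / 800e-9
theta_i = np.radians(30.0)
d = 1e-3 / 1200.0
L = 0.1

k0, inv_vg, alpha = grating_coefficients(omega0, theta_i, d)

# Central-difference derivatives of the phase delay, divided by L.
h = 1e-4 * omega0
p_minus, p_mid, p_plus = (phase_delay(w, theta_i, d, L)
                          for w in (omega0 - h, omega0, omega0 + h))
inv_vg_numeric = (p_plus - p_minus) / (2.0 * h) / L
alpha_numeric = 0.5 * (p_plus - 2.0 * p_mid + p_minus) / h**2 / L

print("1/v_g:", inv_vg, inv_vg_numeric)
print("alpha:", alpha, alpha_numeric)
```

With these coefficients, $\Phi$ in (7.47) becomes $2\alpha L/T^2$, so the stretched duration follows from (7.48) with $L$ in place of $z$.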



Appendix 7.B Causality and Exchange of Energy with the 
Medium 

The group delay function indicates the average arrival of field energy to a point. 
Since this is only part of the whole energy story there is no problem when it 
becomes superluminal. The overly rapid appearance of electromagnetic energy at 
one point and its simultaneous disappearance at another point merely indicates 
an exchange of energy between the electric field and the medium. 10 

We should not be dazzled by the magician who invites the audience to look 
only at the field energy while energy transfers into and out of the 'unwatched' 
domain of the medium. Extra field energy seems to appear 'prematurely' 
downstream only if there is already non-zero field energy downstream to stimulate a 
transfer of energy from the medium. The actual transport of energy is strictly 
bounded by $c$; superluminal propagation of a signal front is impossible. 

10 M. Ware, S. A. Glasgow, and J. Peatross, "Energy Transport in Linear Dielectrics," Opt. Express 9, 
519-532 (2001). 

Figure 7.19 Animation showing a short plane-wave pulse diffracting 
from a grating positioned along the left edge of the frame. 

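In accordance with Poynting's theorem (2.51), the total energy density stored 
in an electromagnetic field and in a medium is given by 

$$u(\mathbf{r},t) = u_{\rm field}(\mathbf{r},t) + u_{\rm med}(\mathbf{r},t) + u(\mathbf{r},-\infty) \tag{7.63}$$

where the time-dependent accumulation of energy transferred into the medium 
from the field (ignoring possible free current $\mathbf{J}_{\rm free}$) is 

$$u_{\rm med}(\mathbf{r},t) = \int_{-\infty}^{t}\mathbf{E}(\mathbf{r},t')\cdot\frac{\partial\mathbf{P}(\mathbf{r},t')}{\partial t'}\,dt' \tag{7.64}$$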



The expression (7.63) for the energy density includes all (relevant) forms of energy, 
including a non-zero integration constant $u(\mathbf{r},-\infty)$ corresponding to energy 
stored in the medium before the arrival of any pulse (important in the case of an 
amplifying medium). $u_{\rm field}(\mathbf{r},t)$ and $u_{\rm med}(\mathbf{r},t)$ are both zero before the arrival of 
the pulse (i.e. at $t = -\infty$). In addition, $u_{\rm field}(\mathbf{r},t)$, given by (2.53), returns to zero 
after the pulse has passed (i.e. at $t = +\infty$). 

As $u_{\rm med}$ increases, the energy in the medium increases. Conversely, as $u_{\rm med}$ 
decreases, the medium surrenders energy to the electromagnetic field. While it is 
possible for $u_{\rm med}$ to become negative, the combination $u_{\rm med} + u(-\infty)$ (i.e. the net 
energy in the medium) can never go negative since a material cannot surrender 
more energy than it has to begin with. 

Poynting's theorem (2.51) has the form of a continuity equation which when 
integrated spatially over a small volume $V$ yields 

$$\oint_A \mathbf{S}\cdot d\mathbf{a} = -\frac{\partial}{\partial t}\int_V u\,dV \tag{7.65}$$

where the left-hand side has been transformed into a surface integral (via the 
divergence theorem (0.11)) representing the power leaving the volume. Let the 
volume be small enough to take $\mathbf{S}$ to be uniform throughout $V$. 

We can define an energy transport velocity (directed along $\mathbf{S}$) as the effective 
speed at which all of the energy density would need to travel in order to achieve 
the Poynting flux: 

$$\mathbf{v}_E \equiv \frac{\mathbf{S}}{u} \tag{7.66}$$

Note that this ratio of the Poynting flux to the energy density has units of velocity. 
When the total energy density $u$ is used in computing (7.66), the energy transport 
velocity has a fictitious nature; it is not the actual velocity of the total energy 
(since part is stationary), but rather the effective velocity necessary to achieve 
the same energy transport that the electromagnetic flux alone delivers. If we 
reduce the denominator to the subset of the energy that can move, namely $u_{\rm field}$, 
the Cauchy-Schwarz inequality (i.e. $\alpha^2 + \beta^2 \ge 2\alpha\beta$) ensures that the energy transport 
velocity $v_E$ remains strictly bounded by the speed of light in vacuum $c$. The total 
energy density $u$ is at least as great as the field energy density $u_{\rm field}$. Hence, this 
strict luminality is maintained. 

Centroid of Energy 

Consider a weighted average of the energy transport velocity: 

$$\langle\mathbf{v}_E\rangle \equiv \frac{\int\mathbf{v}_E\,u\,d^3r}{\int u\,d^3r} = \frac{\int\mathbf{S}\,d^3r}{\int u\,d^3r} \tag{7.67}$$

where we have substituted from (7.66). 
Integration by parts leads to 

$$\langle\mathbf{v}_E\rangle = -\frac{\int\mathbf{r}\,\nabla\cdot\mathbf{S}\,d^3r}{\int u\,d^3r} = \frac{\int\mathbf{r}\,\frac{\partial u}{\partial t}\,d^3r}{\int u\,d^3r} \tag{7.68}$$

where we have assumed that the volume for the integration encloses all energy in 
the system and that the field near the edges of this volume is zero. Since we have 
included all energy, Poynting's theorem (2.51) can be written with no source terms 
(i.e. $\nabla\cdot\mathbf{S} + \partial u/\partial t = 0$). This means that the total energy in the system is conserved 
and is given by the integral in the denominator of (7.68). This allows the derivative 
to be brought out in front of the entire expression giving 

$$\langle\mathbf{v}_E\rangle = \frac{\partial\langle\mathbf{r}\rangle}{\partial t} \quad\text{where}\quad \langle\mathbf{r}\rangle \equiv \frac{\int\mathbf{r}\,u\,d^3r}{\int u\,d^3r} \tag{7.69}$$

The latter expression represents the 'center-of-mass' or centroid of the total 
energy in the system, which is guaranteed to evolve strictly luminally since $v_E$ is 
everywhere luminal.¹¹ 



It is enlightening to consider $u_{\rm med}$ within a frequency-domain context. In an 
isotropic medium, the polarization for an individual plane wave can be written in 
terms of the linear susceptibility defined in (2.16): 

$$\mathbf{P}(\mathbf{r},\omega) = \epsilon_0\,\chi(\mathbf{r},\omega)\,\mathbf{E}(\mathbf{r},\omega) \tag{7.70}$$

We can use this to express $u_{\rm med}$ in terms of the electric field and material 
susceptibility. 



11 Although (7.69) guarantees that the centroid of the total energy moves strictly luminally, there is 
no such limitation on the centroid of field energy alone. The steps leading to (7.69) are not possible 
if $u_{\rm field}$ is used in place of $u$. Explicitly, that is 

$$\left\langle\frac{\mathbf{S}}{u_{\rm field}}\right\rangle \neq \frac{\partial}{\partial t}\,\frac{\int\mathbf{r}\,u_{\rm field}\,d^3r}{\int u_{\rm field}\,d^3r}$$

As was pointed out, the left-hand side is strictly luminal. However, the right-hand side can easily 
exceed $c$ as the medium exchanges energy with the field. In an amplifying medium, for example, the 
rapid appearance of a pulse downstream can occur when the leading portion of a pulse stimulates 
energy already present in the medium to convert to the form of field energy. Group velocity is 
related to this method of accounting, which is why it also can become superluminal. 

Expressing $u_{\rm med}$ in terms of the power spectrum 

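The field $\mathbf{E}(\mathbf{r},t)$ can be expressed as an inverse Fourier transform (7.18). Similarly, 
the polarization $\mathbf{P}$ can be written as¹² 

$$\mathbf{P}(\mathbf{r},t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\mathbf{P}(\mathbf{r},\omega)\,e^{-i\omega t}\,d\omega \quad\Rightarrow\quad \frac{\partial\mathbf{P}(\mathbf{r},t)}{\partial t} = \frac{-i}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\omega\,\mathbf{P}(\mathbf{r},\omega)\,e^{-i\omega t}\,d\omega \tag{7.71}$$

The energy density in the medium (7.64) can then be written as 

$$u_{\rm med}(\mathbf{r},\infty) = \int_{-\infty}^{\infty}\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\mathbf{E}(\mathbf{r},\omega')\,e^{-i\omega' t'}\,d\omega'\right]\cdot\left[\frac{-i\epsilon_0}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\omega\,\chi(\mathbf{r},\omega)\,\mathbf{E}(\mathbf{r},\omega)\,e^{-i\omega t'}\,d\omega\right]dt' \tag{7.72}$$

where we have incorporated (7.70) and evaluated $u_{\rm med}$ after the pulse is over at 
$t = \infty$. We may change the order of integration and write 

$$u_{\rm med}(\mathbf{r},\infty) = -i\epsilon_0\int_{-\infty}^{\infty}d\omega\,\omega\,\chi(\mathbf{r},\omega)\,\mathbf{E}(\mathbf{r},\omega)\cdot\int_{-\infty}^{\infty}d\omega'\,\mathbf{E}(\mathbf{r},\omega')\,\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-i(\omega+\omega')t'}\,dt' \tag{7.73}$$

The final integral is a delta function similar to (0.54), which allows 
the middle integral also to be performed. The expression for $u_{\rm med}$ then reduces to 

$$u_{\rm med}(\mathbf{r},\infty) = -i\epsilon_0\int_{-\infty}^{\infty}\omega\,\chi(\mathbf{r},\omega)\,\mathbf{E}(\mathbf{r},\omega)\cdot\mathbf{E}(\mathbf{r},-\omega)\,d\omega \tag{7.74}$$

In this derivation, we take $\mathbf{E}(\mathbf{r},t)$ and $\mathbf{P}(\mathbf{r},t)$ to be real functions, so we can employ 
the symmetry (7.29) along with 

$$\mathbf{P}^*(\mathbf{r},\omega) = \mathbf{P}(\mathbf{r},-\omega) \quad\text{and}\quad \chi^*(\mathbf{r},\omega) = \chi(\mathbf{r},-\omega).$$

Then we obtain 

$$u_{\rm med}(\mathbf{r},\infty) = \epsilon_0\int_{-\infty}^{\infty}\omega\,\mathrm{Im}\,\chi(\mathbf{r},\omega)\,\mathbf{E}(\mathbf{r},\omega)\cdot\mathbf{E}^*(\mathbf{r},\omega)\,d\omega \tag{7.75}$$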


The expression (7.75) describes the net energy density transferred to a point 
in the medium after all action has finished (i.e. at $t = \infty$). It involves the power 
spectrum of the pulse. We can modify this formula in an intuitive way so that it 
describes the transfer of energy density to the medium for any time during the 
pulse. 

Since the medium is unable to anticipate the spectrum of the entire pulse 
before experiencing it, the material responds to the pulse according to the history 
of the field up to each instant. In particular, the material has to be prepared for 
the possibility of an abrupt cessation of the pulse at any moment, in which case 
all exchange of energy with the medium immediately ceases. In this extreme 
scenario, there is no possibility for the medium to recover from previously incorrect 
attenuation or amplification, so it must have gotten it right already. 

12 We assume that the real forms of the fields in the time domain are used for the sake of this 
multiplication. 

If the pulse were in fact to abruptly terminate at a given instant, it would 
not be necessary to integrate the inverse Fourier transform (7.19) beyond the 
termination time t after which all contributions are zero. Causality requires that 
the medium be indifferent to whether a pulse actually terminates if that possibility 
lies in the future. Therefore, (7.75) can apply for any time t (not just for t = oo) 
if the spectrum (7.19) is evaluated just for that portion of the field previously 
experienced by the medium (up to time t). 

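The following is then an exact representation for the energy density (7.64) 
transferred to the medium: 

$$u_{\rm med}(\mathbf{r},t) = \epsilon_0\int_{-\infty}^{\infty}\omega\,\mathrm{Im}\,\chi(\mathbf{r},\omega)\,\mathbf{E}_t(\mathbf{r},\omega)\cdot\mathbf{E}_t^*(\mathbf{r},\omega)\,d\omega \tag{7.76}$$

where 

$$\mathbf{E}_t(\mathbf{r},\omega) \equiv \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{t}\mathbf{E}(\mathbf{r},t')\,e^{i\omega t'}\,dt' \tag{7.77}$$

This time dependence enters only through $\mathbf{E}_t(\mathbf{r},\omega)\cdot\mathbf{E}_t^*(\mathbf{r},\omega)$, known as the 
instantaneous power spectrum. 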

The expression (7.76) gives physical insight into the manner in which causal 
dielectric materials exchange energy with different parts of an electromagnetic 
pulse. Since the function E t (<y) is the Fourier transform of the pulse truncated 
at the current time t and set to zero thereafter, it can include many frequency 
components that are not present in the pulse taken in its entirety. This explains 
why the medium can respond differently to the front of a pulse compared to the 
back. Even though absorption or amplification resonances may lie outside of 
the spectral envelope of a pulse taken in its entirety, the instantaneous spectrum 
on a portion of the pulse can momentarily lap onto or off of resonances in the 
medium. 
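
The widening of the instantaneous spectrum can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the Gaussian pulse, its parameters (`T`, `omega0`), the frequency window, and the function names are hypothetical choices, not taken from the text. It evaluates a discrete version of (7.77) for a pulse truncated early on its leading edge and for the complete pulse, and compares the root-mean-square widths of the two power spectra.

```python
import numpy as np

def instantaneous_power_spectrum(t_grid, field, t_now, omegas):
    """|E_t(omega)|^2 following (7.77): transform only the field seen up to t_now."""
    dt = t_grid[1] - t_grid[0]
    seen = t_grid <= t_now                       # causal truncation at t_now
    # Discrete version of (1/sqrt(2 pi)) * integral of E(t') exp(i omega t') dt'
    kernel = np.exp(1j * np.outer(omegas, t_grid[seen]))
    E_t = kernel @ field[seen] * dt / np.sqrt(2 * np.pi)
    return np.abs(E_t) ** 2

def rms_width(omegas, spectrum):
    """Root-mean-square width of a spectrum about its centroid."""
    weight = spectrum / spectrum.sum()
    mean = np.sum(weight * omegas)
    return np.sqrt(np.sum(weight * (omegas - mean) ** 2))

# Assumed Gaussian pulse: duration T = 1, carrier omega0 = 20 (arbitrary units)
T, omega0 = 1.0, 20.0
t = np.linspace(-8.0, 8.0, 2001)
E = np.exp(-t**2 / (2 * T**2)) * np.exp(-1j * omega0 * t)
omegas = np.linspace(omega0 - 15.0, omega0 + 15.0, 601)

# Spectrum perceived on the leading edge (t = -1) versus the complete pulse
width_front = rms_width(omegas, instantaneous_power_spectrum(t, E, -1.0, omegas))
width_full = rms_width(omegas, instantaneous_power_spectrum(t, E, 8.0, omegas))
print(width_front, width_full)
```

The abrupt truncation gives the early spectrum broad wings, so within this frequency window `width_front` comes out several times larger than `width_full`. Those wings are what allow the front of a pulse to reach a resonance that the complete spectrum barely touches.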

In view of (7.76) and (7.77) it is straightforward to predict when the electro- 
magnetic energy of a pulse will exhibit superluminal or subluminal behavior. In 
section 7.5, we saw that this behavior is controlled by the group velocity function. 
However, with (7.76) and (7.77), it is not necessary to examine the group velocity 
directly, but only the imaginary part of the susceptibility $\chi(\mathbf{r},\omega)$.

If the entire pulse passing through point r has a spectrum in the neighborhood 
of an amplifying resonance, but not on the resonance, superluminal behavior 
can result. The instantaneous spectrum during the front portion of the pulse is 
generally wider and can therefore lap onto the nearby gain peak. The medium 
accordingly amplifies this perceived spectrum, and the front of the pulse grows. 
The energy is then returned to the medium from the latter portion of the pulse 
as the instantaneous spectrum narrows and withdraws from the gain peak. The
effect is not only consistent with the principle of causality, it is a direct and general
consequence of causality as demonstrated by (7.76) and (7.77).

Figure 7.20 Real and imaginary parts of the refractive index for an amplifying medium, plotted against $(\omega-\omega_0)/\gamma$.

Figure 7.21 Animation of a narrowband pulse traversing an amplifying medium off resonance. The black dot shows the movement of the center of all energy. The red line inside the medium shows the energy held in that medium, which cannot go negative. The lower figure shows the instantaneous spectrum of the pulse at the front of the medium relative to the narrow amplifying resonance.

As an illustration, consider the broadband waveform with $T_2 = \gamma^{-1}/\sqrt{2}$ described
in Example 7.8. Consider an amplifying medium with index shown in
Fig. 7.20 with the amplifying resonance (negative oscillator strength) set on the
frequency $\omega_0 = \bar{\omega} + 2\gamma$, where $\bar{\omega}$ is the carrier frequency. Thus, the resonance
structure is centered a modest distance above the carrier frequency, and there is 
only minor spectral overlap between the pulse and the resonance structure. 

Superluminal behavior can occur in amplifying materials when the forward 
edge of a narrow-band pulse receives extra amplification. Fig. 7.21 shows how the 
early portion of a pulse has a wide instantaneous spectrum computed by (7.77) 
that can lap onto the amplifying resonance. As the wings grow and access the 
neighboring resonance, the pulse extracts more energy from the medium. As the 
wings diminish, the pulse surrenders much of that energy back to the medium, 
which shifts the center of the pulse forward. 

In this appendix we have indirectly proven that a sharply defined signal edge
cannot propagate faster than $c$. If a signal edge begins abruptly at time $t_0$, the
instantaneous spectrum $\mathbf{E}_t(\omega)$ clearly remains identically zero until that time. In
other words, no energy may be exchanged with the medium until the field energy
from the pulse arrives. Since, as was pointed out in connection with (7.66), the
Cauchy-Schwarz inequality prevents the field energy from traveling faster than $c$,
at no point in the medium can a signal front exceed $c$.

Appendix 7.C Kramers-Kronig Relations 

In the late 1920s, Ralph Kronig and Hendrik Kramers independently discovered
a remarkable relationship between the real and imaginary parts of a material's
susceptibility $\chi(\omega)$. Recall that the susceptibility as defined in (2.16) relates the
polarization of a material to the field that stimulates the medium:

$$\mathbf{P}(\omega) = \epsilon_0\,\chi(\omega)\,\mathbf{E}(\omega) \tag{7.78}$$

They made an argument based on causality (i.e. effect cannot precede cause),
which allows one to obtain the real part of $\chi(\omega)$ from the imaginary part of $\chi(\omega)$,
if it is known for all $\omega$. Similarly, one can obtain the imaginary part of $\chi(\omega)$ from
the real part of $\chi(\omega)$. We develop the Kramers-Kronig formulas below.13

We can replace $\mathbf{E}(\omega)$ in (7.78) with the Fourier transform of $\mathbf{E}(t)$ in accordance
with (7.19). In addition, we take the inverse Fourier transform (7.19) of both sides
of (7.78) and obtain

$$\mathbf{P}(t) = \frac{\epsilon_0}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\chi(\omega)\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\mathbf{E}(t')\,e^{i\omega t'}\,dt'\right]e^{-i\omega t}\,d\omega \tag{7.79}$$

13 See J. D. Jackson, Classical Electrodynamics, 3rd ed., Sect. 7.10 (New York: John Wiley, 1999).
Also B. Y.-K. Hu, "Kramers-Kronig in two lines," Am. J. Phys. 57, 821 (1989).
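Next we interchange the order of integration to get

$$\mathbf{P}(t) = \epsilon_0\int_{-\infty}^{\infty}\mathbf{E}(t')\left[\frac{1}{2\pi}\int_{-\infty}^{\infty}\chi(\omega)\,e^{-i\omega(t-t')}\,d\omega\right]dt' \tag{7.80}$$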

Now for the causality argument: The polarization of the medium $\mathbf{P}(t)$ cannot
depend on the field $\mathbf{E}(t')$ at future times $t' > t$. Therefore the expression in square
brackets must be identically zero unless $t - t' > 0$. This places a restriction on the
functional form of $\chi(\omega)$ as we shall see.

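The causality argument comes explicitly into play when we employ the following
integral formula:14

$$e^{-i\omega(t-t')} = \frac{\text{sign}(t-t')}{i\pi}\int_{-\infty}^{\infty}\frac{e^{-i\omega'(t-t')}}{\omega-\omega'}\,d\omega' \tag{7.81}$$

Apparently we require the positive sign since

$$\text{sign}(t-t') = \begin{cases} +1 & (t>t') \\ -1 & (t<t') \end{cases}$$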

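Upon substitution of (7.81) into (7.80) and after changing the order of integration
within the square brackets we obtain

$$\mathbf{P}(t) = \epsilon_0\int_{-\infty}^{\infty}\mathbf{E}(t')\left[\frac{1}{2\pi}\int_{-\infty}^{\infty}\left(\frac{1}{i\pi}\int_{-\infty}^{\infty}\frac{\chi(\omega)}{\omega-\omega'}\,d\omega\right)e^{-i\omega'(t-t')}\,d\omega'\right]dt' \tag{7.82}$$

For (7.80) and (7.82) to be the same, we require

$$\chi(\omega) = \frac{1}{i\pi}\int_{-\infty}^{\infty}\frac{\chi(\omega')}{\omega'-\omega}\,d\omega' \tag{7.83}$$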



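or

$$\mathrm{Re}\,\chi(\omega) + i\,\mathrm{Im}\,\chi(\omega) = \frac{1}{i\pi}\int_{-\infty}^{\infty}\frac{\mathrm{Re}\,\chi(\omega') + i\,\mathrm{Im}\,\chi(\omega')}{\omega'-\omega}\,d\omega' \tag{7.84}$$

Finally, equating separately the real and imaginary parts of the above equation
yields

$$\mathrm{Re}\,\chi(\omega) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\mathrm{Im}\,\chi(\omega')}{\omega'-\omega}\,d\omega' \quad\text{and}\quad \mathrm{Im}\,\chi(\omega) = -\frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\mathrm{Re}\,\chi(\omega')}{\omega'-\omega}\,d\omega' \tag{7.85}$$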



14 This integral, which is a specific instance of Cauchy's theorem, is tricky because it involves two
diverging pieces, one on either side of the singularity $\omega = \omega'$. The divergences have opposite sign, so
they cancel. The integration must approach the singularity in the same manner from either side, in
which case the result is called the principal value. In practical terms, if the integral is performed
numerically, the sampling of points should straddle the singularity symmetrically; other sampling
schemes can change the result dramatically and give an incorrect answer.
These are known as the Kramers-Kronig relations on real and imaginary parts of
$\chi$.15 If the real part of $\chi$ is known at all frequencies, we can use the Kramers-Kronig
relations to generate the imaginary part, and vice versa. We see that the real and
imaginary parts of $\chi$ cannot be chosen independently, if we are to respect the
principle of causality. 
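
The Kramers-Kronig relations can be tested numerically on a model susceptibility. The sketch below is an illustration only: it assumes a single-resonance Lorentz form for $\chi(\omega)$ with made-up parameters, and the function names are hypothetical. The principal value is approximated with a grid whose points sit at offsets $\pm h/2, \pm 3h/2, \ldots$ from the singularity, so that the sampling straddles it symmetrically.

```python
import numpy as np

def chi_lorentz(omega, omega0=10.0, gamma=1.0, omega_p=3.0):
    """Single-resonance Lorentz susceptibility (an assumed test model)."""
    return omega_p**2 / (omega0**2 - omega**2 - 1j * gamma * omega)

def symmetric_offsets(half_range, step):
    """Offsets omega' - omega = +-step/2, +-3*step/2, ... straddling the singularity."""
    n = int(half_range / step)
    return (np.arange(-n, n) + 0.5) * step

def kk_real_from_imag(im_chi, omega, half_range=400.0, step=0.01):
    """Re chi(omega) from Im chi, first relation in (7.85), as a principal-value sum."""
    x = symmetric_offsets(half_range, step)
    return np.sum(im_chi(omega + x) / x) * step / np.pi

def kk_imag_from_real(re_chi, omega, half_range=400.0, step=0.01):
    """Im chi(omega) from Re chi, second relation in (7.85)."""
    x = symmetric_offsets(half_range, step)
    return -np.sum(re_chi(omega + x) / x) * step / np.pi

omega_test = 9.5
exact = chi_lorentz(omega_test)
re_kk = kk_real_from_imag(lambda w: chi_lorentz(w).imag, omega_test)
im_kk = kk_imag_from_real(lambda w: chi_lorentz(w).real, omega_test)
print(exact.real, re_kk)
print(exact.imag, im_kk)
```

Each pair of printed numbers should agree closely, since the diverging contributions on the two sides of the singularity cancel pairwise on the symmetric grid. Shifting the grid so that it no longer straddles $\omega' = \omega$ evenly spoils that cancellation.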

Example 7.9 

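Show that the expression in square brackets of (7.80) is zero when $t' > t$, if $\chi(\omega)$
satisfies the Kramers-Kronig relations (7.85).

Solution: The expression may be written as

$$\begin{aligned}
\int_{-\infty}^{\infty}\chi(\omega)\,e^{-i\omega(t-t')}\,d\omega
&= \int_{-\infty}^{\infty}\mathrm{Re}\,\chi(\omega)\,e^{-i\omega(t-t')}\,d\omega + i\int_{-\infty}^{\infty}\mathrm{Im}\,\chi(\omega)\,e^{-i\omega(t-t')}\,d\omega \\
&= \int_{-\infty}^{\infty}\mathrm{Re}\,\chi(\omega)\,e^{-i\omega(t-t')}\,d\omega + i\int_{-\infty}^{\infty}\left[-\frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\mathrm{Re}\,\chi(\omega')}{\omega'-\omega}\,d\omega'\right]e^{-i\omega(t-t')}\,d\omega \\
&= \int_{-\infty}^{\infty}\mathrm{Re}\,\chi(\omega)\,e^{-i\omega(t-t')}\,d\omega + \int_{-\infty}^{\infty}\mathrm{Re}\,\chi(\omega')\left[\frac{1}{i\pi}\int_{-\infty}^{\infty}\frac{e^{-i\omega(t-t')}}{\omega'-\omega}\,d\omega\right]d\omega'
\end{aligned} \tag{7.86}$$

where we have invoked the Kramers-Kronig relation for $\mathrm{Im}\,\chi(\omega)$ (7.85) and interchanged
the order of integration in the final expression. Since we are specifically
considering future times $t' > t$, we have by (7.81)

$$\frac{1}{i\pi}\int_{-\infty}^{\infty}\frac{e^{-i\omega(t-t')}}{\omega'-\omega}\,d\omega = -e^{-i\omega'(t-t')}$$

Hence

$$\int_{-\infty}^{\infty}\chi(\omega)\,e^{-i\omega(t-t')}\,d\omega = \int_{-\infty}^{\infty}\mathrm{Re}\,\chi(\omega)\,e^{-i\omega(t-t')}\,d\omega - \int_{-\infty}^{\infty}\mathrm{Re}\,\chi(\omega')\,e^{-i\omega'(t-t')}\,d\omega' = 0 \tag{7.87}$$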

Finally, it is worth noting that the Kramers-Kronig relations also apply to the
real and imaginary parts of the index of refraction (subtract one):16

$$n(\omega)-1 = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\kappa(\omega')}{\omega'-\omega}\,d\omega' \quad\text{and}\quad \kappa(\omega) = -\frac{1}{\pi}\int_{-\infty}^{\infty}\frac{n(\omega')-1}{\omega'-\omega}\,d\omega' \tag{7.88}$$

15 As with (7.81), the principal value of the integral must be calculated. If the integral is performed
numerically, the sampling of points should straddle the singularity symmetrically. Separately, the
integral on each side of $\omega' = \omega$ diverges, but with opposite sign.

One can use the Kramers-Kronig relations to find the real part of the index from 
a measurement of absorption, if the measurement is done over a broad enough 
range of the spectrum. This is the most useful form of the Kramers-Kronig rela- 
tions. 

It is sometimes convenient to multiply the numerator and denominator inside
the integrands of (7.88) by $\omega' + \omega$. Then noting that $n$ is an even function and
$\kappa$ is an odd function allows us to dismiss either $\omega'$ or $\omega$ in the numerator and
integrate17 over positive frequencies only:

$$n(\omega)-1 = \frac{2}{\pi}\int_{0}^{\infty}\frac{\omega'\kappa(\omega')}{\omega'^2-\omega^2}\,d\omega' \quad\text{and}\quad \kappa(\omega) = -\frac{2\omega}{\pi}\int_{0}^{\infty}\frac{n(\omega')-1}{\omega'^2-\omega^2}\,d\omega' \tag{7.89}$$

16 This follows from Cauchy's theorem since the complex index $n + i\kappa$ is the square root of $1 + \chi(\omega)$.
The Kramers-Kronig relations for $\chi(\omega)$ guarantee that $\chi(\omega)$ has no poles in the upper half complex
plane, when $\omega$ is considered (for mathematical purposes) to be a complex variable. Taking the
square root does not introduce poles into the upper half plane.

17 The integrals (7.88) and (7.89) diverge to either side of $\omega' = \omega$, but with opposite sign. Again,
the principal value of the integral is required, which means a numeric grid should straddle the
singularity symmetrically.
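
The intermediate step behind the positive-frequency form may be worth writing out. The lines below are a sketch of that step, assuming only that $n(\omega)$ is even and $\kappa(\omega)$ is odd as stated above; terms whose integrand is odd in $\omega'$ integrate to zero over the symmetric range, and the even terms double when the range is folded onto positive frequencies.

```latex
\begin{aligned}
n(\omega)-1 &= \frac{1}{\pi}\int_{-\infty}^{\infty}
   \frac{\kappa(\omega')\,(\omega'+\omega)}{\omega'^2-\omega^2}\,d\omega'
 = \frac{1}{\pi}\int_{-\infty}^{\infty}
   \frac{\omega'\kappa(\omega')}{\omega'^2-\omega^2}\,d\omega'
 = \frac{2}{\pi}\int_{0}^{\infty}
   \frac{\omega'\kappa(\omega')}{\omega'^2-\omega^2}\,d\omega' \\
\kappa(\omega) &= -\frac{1}{\pi}\int_{-\infty}^{\infty}
   \frac{\left[n(\omega')-1\right](\omega'+\omega)}{\omega'^2-\omega^2}\,d\omega'
 = -\frac{\omega}{\pi}\int_{-\infty}^{\infty}
   \frac{n(\omega')-1}{\omega'^2-\omega^2}\,d\omega'
 = -\frac{2\omega}{\pi}\int_{0}^{\infty}
   \frac{n(\omega')-1}{\omega'^2-\omega^2}\,d\omega'
\end{aligned}
```

In the first line the term $\omega\,\kappa(\omega')$ is odd in $\omega'$ and drops out; in the second line the term $\omega'\left[n(\omega')-1\right]$ is odd and drops out.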
Exercises

Exercises for 7.1 Intensity of Superimposed Plane Waves

P7.1 (a) Consider two counter-propagating fields described by $\hat{\mathbf{x}}E_1 e^{i(kz-\omega t)}$
and $\hat{\mathbf{x}}E_2 e^{i(-kz-\omega t)}$ where $E_1$ and $E_2$ are both real. Show that their sum
can be written as

$$\hat{\mathbf{x}}E_{\text{tot}}(z)\,e^{i\left(\Phi(z)-\omega t\right)}$$

where

$$E_{\text{tot}}(z) = E_1\sqrt{\left(1-\frac{E_2}{E_1}\right)^2 + 4\frac{E_2}{E_1}\cos^2 kz}$$

and

$$\Phi(z) = \tan^{-1}\left[\frac{1-E_2/E_1}{1+E_2/E_1}\tan kz\right]$$

Outside the range $-\pi/2 \le kz \le \pi/2$ the pattern repeats.

(b) Suppose that two counter-propagating laser fields have separate
intensities, $I_1$ and $I_2 = I_1/100$. The ratio of the fields is then $E_2/E_1 = 1/10$.
In the standing interference pattern that results, what is the ratio
of the peak intensity to the minimum intensity? Are you surprised how
high this is?

P7.2 Equation (7.7) implies that there is no interference between fields that
are polarized along orthogonal dimensions. That is, the intensity of

$$\mathbf{E}(\mathbf{r},t) = \hat{\mathbf{x}}E_0\,e^{i\left[(k\hat{\mathbf{z}})\cdot\mathbf{r}-\omega t\right]} + \hat{\mathbf{y}}E_0\,e^{i\left[(k\hat{\mathbf{x}})\cdot\mathbf{r}-\omega t\right]}$$

according to (7.7) is uniform throughout space. Of course (7.7) does not
apply since the k-vectors are not parallel. Show that the time-average
of $\mathbf{S}(\mathbf{r},t)$ according to (7.4) exhibits interference in the distribution of
net energy flow.



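Exercises for 7.2 Group vs. Phase Velocity: Sum of Two Plane Waves

P7.3 Show that (7.10) can be written as

$$\mathbf{E}(\mathbf{r},t) = 2\mathbf{E}_0\,e^{i\left(\frac{\mathbf{k}_2+\mathbf{k}_1}{2}\cdot\mathbf{r}-\frac{\omega_2+\omega_1}{2}t\right)}\cos\left(\frac{\Delta\mathbf{k}}{2}\cdot\mathbf{r}-\frac{\Delta\omega}{2}t\right)$$

From this show that the speed of the rapid-oscillation intensity peaks
in Fig. 7.2 is $v_p' = \bar{\omega}/\bar{k}$ where

$$\bar{k} \equiv \frac{k_1+k_2}{2} \quad\text{and}\quad \bar{\omega} \equiv \frac{\omega_1+\omega_2}{2}$$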

P7.4 Confirm the right-hand side of (7.17). 
Exercises for 7.3 Frequency Spectrum of Light

P7.5 The continuous field of a very narrowband continuous laser may be
approximated as a pure plane wave: $\mathbf{E}(\mathbf{r},t) = \mathbf{E}_0 e^{i(k_0 z-\omega_0 t)}$. Suppose the
wave encounters a shutter at the plane $z = 0$.

(a) Compute the power spectrum of the light before the shutter. HINT:
The answer is proportional to the square of a delta function centered
on $\omega_0$ (see (0.54)).

(b) Compute the power spectrum after the shutter if it is opened during
the interval $-T/2 \le t \le T/2$. Plot the result. Are you surprised that the
shutter appears to create extra frequency components?

HINT: Write your answer in terms of the sinc function defined by
$\text{sinc}\,\alpha \equiv \sin\alpha/\alpha$.

P7.6 (a) Determine the Full-Width-at-Half-Maximum of the intensity (i.e.
the width of $I(\mathbf{r},t)$ represented by $\Delta t_{\text{FWHM}}$) and of the power spectrum
(i.e. the width of $I(\mathbf{r},\omega)$ represented by $\Delta\omega_{\text{FWHM}}$) for the Gaussian pulse
defined in (7.25).

HINT: Both answers are in terms of $T$.

(b) Give an uncertainty principle for the product of $\Delta t_{\text{FWHM}}$ and $\Delta\omega_{\text{FWHM}}$.

Exercises for 7.5 Quadratic Dispersion 

P7.7 The intensity of a Gaussian laser pulse has a FWHM duration $T_{\text{FWHM}} = 25$ fs
with carrier frequency $\omega_0$ corresponding to $\lambda_{\text{vac}} = 800$ nm. The
pulse goes through a lens of thickness $\ell = 1$ cm (laser quality glass type
BK7) with index of refraction given approximately by

$$n(\omega) \cong 1.4948 + 0.016\,\frac{\omega}{\omega_0}$$

What is the full-width-at-half-maximum of the intensity for the emerging
pulse?

HINT: For the input pulse we have

$$T = \frac{T_{\text{FWHM}}}{2\sqrt{\ln 2}}$$

(see P7.6).

P7.8 If the pulse defined in (7.46) travels through the material for a very long
distance $z$ such that $\mathcal{T}(z) \rightarrow T\Phi(z)$ and $\tan^{-1}\Phi(z) \rightarrow \pi/2$, show that
the instantaneous frequency of the pulse is

$$\omega_0 + \frac{2t - 2z/v_g}{4\alpha z}$$

COMMENT: As the wave travels, the earlier part of the pulse oscillates
more slowly than the later part. This is called chirp, and it means that
the red frequencies get ahead of the blue ones since they experience a
lower index.



Exercises for 7.6 Generalized Context for Group Delay 

P7.9 When the spectrum is narrow compared to features in a resonance
(such as in Fig. 7.10), the reshaping delay (7.53) tends to zero and can
be ignored. Show that when the spectrum is narrow the net group delay
(7.52) reduces to

$$\lim_{T\rightarrow\infty}\Delta t_G(\mathbf{r}) = \left.\frac{\partial\,\mathrm{Re}\,\mathbf{k}}{\partial\omega}\right|_{\bar{\omega}}\cdot\Delta\mathbf{r}$$

P7.10 When the spectrum is very broad the reshaping delay (7.53) also tends
to zero and can be ignored. Show that when the spectrum is extremely
broad, the net group delay reduces to

$$\lim_{T\rightarrow 0}\Delta t_G(\mathbf{r}) = \frac{\Delta r}{c}$$

assuming $\mathbf{k}$ and $\Delta\mathbf{r}$ are parallel. This implies that a sharply defined
signal cannot travel faster than $c$.

HINT: The real index of refraction $n$ goes to unity far from resonance,
and the imaginary part $\kappa$ goes to zero.

P7.11 Work through the derivation of (7.51).

HINT: This somewhat lengthy derivation can be found in the reference 
in the footnote near (7.51). 



Exercises for 7.A Pulse Chirping in a Grating Pair

P7.12 A Gaussian pulse with $T = 20$ fs is incident with $\theta_i = 20°$ on a grating
pair with groove separation $d = 1.67\ \mu$m. What grating separation $L$
will lead to a pulse duration of $T = 100$ ps? Assume two passes through
the grating pair for a total effective separation of $2L$. Take the pulse
carrier frequency to correspond to $\lambda = 800$ nm.



Chapter 8 

Coherence Theory 



Coherence theory is the study of correlations that exist between different parts of 
a light field. In temporal coherence theory, we focus on the correlation between 
the fields at different times, $\mathbf{E}(\mathbf{r},t)$ and $\mathbf{E}(\mathbf{r},t+\tau)$. In spatial coherence theory,
we focus on the correlations between fields at different spatial locations, $\mathbf{E}(\mathbf{r},t)$
and $\mathbf{E}(\mathbf{r}+\Delta\mathbf{r},t)$. Because light oscillations are too fast to resolve directly, we
usually need to study optical coherence using interference techniques. In these
techniques, light from different times or places in the light field is brought
together at a detection point. If the two fields have a high degree of coherence,
they consistently interfere either constructively or destructively at the detection 
point. If the two fields are not coherent, the interference at the detection point 
rapidly fluctuates between constructive and destructive interference, so that a 
time-averaged signal does not show interference. 

You are probably already familiar with two instruments that measure coher- 
ence: the Michelson interferometer, which measures temporal coherence, and 
Young's two-slit interferometer, which measures spatial coherence. Your pre- 
liminary understanding of these instruments was probably gained in terms of 
single-frequency plane waves, which are perfectly coherent for all separations in 
time and space. In this chapter, we build on that foundation and derive descrip- 
tions that are appropriate when light with imperfect coherence is sent through 
these instruments. We also discuss a practical application known as Fourier spec- 
troscopy (Section 8.4) which allows us to measure the spectrum of light using a 
Michelson interferometer rather than a grating spectrometer. 

8.1 Michelson Interferometer 

A Michelson interferometer employs a 50:50 beamsplitter to divide an initial 
beam into two identical beams and then delays one beam with respect to the 
other before bringing them back together (see Fig. 8.1). Depending on the relative 
path difference d (roundtrip by our convention) between the two arms of the 
system, the light can interfere constructively or destructively in the direction of 
the detector. The relative path difference $d$ introduces a relative time delay $\tau$,
defined by $\tau = d/c$.

Figure 8.1 Michelson interferometer, showing the beam splitter and the detector.

Figure 8.2 The intensity seen at the detector of a Michelson interferometer with a plane-wave input. Because the plane wave is infinitely coherent, the output oscillates forever in both directions. Energy is conserved, so when the intensity at the detector is zero, all of the input light is being sent back on the input arm of the interferometer.

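If the input light is a plane wave, the net field at the detector consists of the
field coming from one arm of the interferometer, $\mathbf{E}_0 e^{i(kz-\omega t)}$, added to the field
coming from the other arm, $\mathbf{E}_0 e^{i(kz-\omega(t-\tau))}$. These two fields are identical except
for the delay $\tau$. The intensity seen at the detector as a function of path difference
is computed to be

$$\begin{aligned}
I_{\text{tot}}(\tau) &= \frac{c\epsilon_0}{2}\left[\mathbf{E}_0 e^{i(kz-\omega t)} + \mathbf{E}_0 e^{i(kz-\omega(t-\tau))}\right]\cdot\left[\mathbf{E}_0 e^{i(kz-\omega t)} + \mathbf{E}_0 e^{i(kz-\omega(t-\tau))}\right]^* \\
&= \frac{c\epsilon_0}{2}\left[2\mathbf{E}_0\cdot\mathbf{E}_0^* + 2\mathbf{E}_0\cdot\mathbf{E}_0^*\cos(\omega\tau)\right] \\
&= 2I_0\left[1+\cos(\omega\tau)\right] \qquad \text{(Plane Wave Input)}
\end{aligned} \tag{8.1}$$

where $I_0 \equiv \frac{c\epsilon_0}{2}\mathbf{E}_0\cdot\mathbf{E}_0^*$ is the intensity from one beam alone (when the other arm of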
the interferometer is blocked). This formula is probably familiar. It describes how 
the intensity at the detector oscillates between zero and four times the intensity 
of the beam from one arm when the other is blocked, 1 as plotted in Fig. 8.2. 
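
As a quick check of the plane-wave result, the following Python sketch adds the two arm fields directly and compares the resulting intensity with $2I_0[1+\cos(\omega\tau)]$. The field amplitude and the 633 nm wavelength are assumed values chosen only for illustration, and `detector_intensity` is a hypothetical helper name.

```python
import numpy as np

c = 299_792_458.0           # speed of light in vacuum (m/s)
eps0 = 8.8541878128e-12     # vacuum permittivity (F/m)

def detector_intensity(E0, omega, tau, z=0.0, t=0.0):
    """(c*eps0/2)|E_tot|^2 for two equal plane waves, one delayed by tau."""
    k = omega / c
    arm1 = E0 * np.exp(1j * (k * z - omega * t))
    arm2 = E0 * np.exp(1j * (k * z - omega * (t - tau)))
    return 0.5 * c * eps0 * np.abs(arm1 + arm2) ** 2

E0 = 100.0                              # assumed field amplitude (V/m)
wavelength = 633e-9                     # assumed vacuum wavelength (m)
omega = 2 * np.pi * c / wavelength
I0 = 0.5 * c * eps0 * E0**2             # one arm alone

# Round-trip path differences d, converted to delays tau = d / c
for d in (0.0, wavelength / 4, wavelength / 2, wavelength):
    tau = d / c
    print(d, detector_intensity(E0, omega, tau), 2 * I0 * (1 + np.cos(omega * tau)))
```

The two printed columns agree for every path difference. The choices of `z` and `t` do not matter, because the common phase factor drops out of the squared magnitude.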

When light containing a continuous band of frequencies is sent through the 
interferometer, (8.1) no longer holds. Instead of repeating indefinitely, the oscillations
in the intensity at the detector become less pronounced as $\tau$ increases. The
concept of temporal coherence describes how fast fringe visibility diminishes as
delay is introduced in an arm of the Michelson interferometer. The less coherent
the light source, the faster the fringes die out as $\tau$ is increased. To model this
behavior, we need to expand our analysis beyond (8.1). 

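Consider an arbitrary waveform $\mathbf{E}(t)$ (composed of many frequency components)
that has traveled through the first arm of a Michelson interferometer to
arrive at the detector in Fig. 8.1. Again, $\mathbf{E}(t)$ is the value of the field at the detector
when the second arm is blocked. The beam that travels through the second arm
of the interferometer is identical, but delayed by the round-trip delay $\tau$: $\mathbf{E}(t-\tau)$.
The total field at the detector is the sum of these two fields:

$$\mathbf{E}_{\text{tot}}(t,\tau) = \mathbf{E}(t) + \mathbf{E}(t-\tau) \tag{8.2}$$

The total intensity $I_{\text{tot}}$ at the detector is found using (7.21) with $n = 1$:

$$\begin{aligned}
I_{\text{tot}}(t,\tau) &= \frac{c\epsilon_0}{2}\,\mathbf{E}_{\text{tot}}(t,\tau)\cdot\mathbf{E}_{\text{tot}}^*(t,\tau) \\
&= \frac{c\epsilon_0}{2}\left[\mathbf{E}(t)\cdot\mathbf{E}^*(t) + \mathbf{E}(t)\cdot\mathbf{E}^*(t-\tau) + \mathbf{E}(t-\tau)\cdot\mathbf{E}^*(t) + \mathbf{E}(t-\tau)\cdot\mathbf{E}^*(t-\tau)\right] \\
&= I(t) + I(t-\tau) + \frac{c\epsilon_0}{2}\left[\mathbf{E}(t)\cdot\mathbf{E}^*(t-\tau) + \mathbf{E}(t-\tau)\cdot\mathbf{E}^*(t)\right] \\
&= I(t) + I(t-\tau) + c\epsilon_0\,\mathrm{Re}\left\{\mathbf{E}(t)\cdot\mathbf{E}^*(t-\tau)\right\}
\end{aligned} \tag{8.3}$$

The function $I(t)$ corresponds to the intensity of one of the beams arriving at the
detector while the opposite path of the interferometer is blocked.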



1 Keep in mind that if a 50:50 beam splitter is used, then the intensity arriving to the detector
from one arm alone (with other arm blocked) is one fourth of the original beam, since the light
meets the beam splitter twice.
For now we treat $\mathbf{E}(t)$ as a pulse with a finite duration and energy to simplify
the math. Later we illustrate how to adapt this analysis for continuous light
sources. In (8.3) we have retained the $t$ dependence of $I_{\text{tot}}(t,\tau)$ in addition to the
dependence on the path delay $\tau$. This allows for pulses with arbitrary duration
and shape. The rapid oscillations of the light are automatically averaged away in
$I(t)$ since we used (7.21), but the slowly varying envelope of the pulse is retained.

For a pulsed source, the physical signal from a Michelson interferometer is
proportional to the total amount of pulse energy arriving at the detector as a
function of $\tau$.2 This physical signal, which we'll denote by $\text{Sig}(\tau)$, is proportional
to the total energy per area, or fluence, accumulated at the detector:

$$\text{Sig}(\tau) \propto \int_{-\infty}^{\infty} I_{\text{tot}}(t,\tau)\,dt \tag{8.4}$$

The proportionality constant will depend on the area of the beam, as well as the
units with which the detector reports $\text{Sig}(\tau)$ (volts, etc.). We can manipulate the
fluence integral in (8.4) into a more useful form that will make the coherence
properties more evident.

Manipulation of the fluence integral 

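Inserting (8.3) in the fluence integral, we have

$$\int_{-\infty}^{\infty} I_{\text{tot}}(t,\tau)\,dt = \int_{-\infty}^{\infty} I(t)\,dt + \int_{-\infty}^{\infty} I(t-\tau)\,dt + c\epsilon_0\,\mathrm{Re}\int_{-\infty}^{\infty}\mathbf{E}(t)\cdot\mathbf{E}^*(t-\tau)\,dt \tag{8.5}$$

The first two integrals on the right-hand side of (8.5) are equal,3 and give the
fluence $\mathcal{E}$ from one arm of the interferometer when the other arm is blocked:

$$\mathcal{E} \equiv \int_{-\infty}^{\infty} I(t)\,dt = \int_{-\infty}^{\infty} I(t-\tau)\,dt \tag{8.6}$$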

The final integral in (8.5) remains unchanged if we take a Fourier transform followed
by an inverse Fourier transform:

$$\int_{-\infty}^{\infty}\mathbf{E}(t)\cdot\mathbf{E}^*(t-\tau)\,dt = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}d\omega\,e^{-i\omega\tau}\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}d\tau\,e^{i\omega\tau}\int_{-\infty}^{\infty}\mathbf{E}(t)\cdot\mathbf{E}^*(t-\tau)\,dt\right] \tag{8.7}$$

The reason for this procedure is so that we can take advantage of the autocorrelation
theorem described in P0.27. With it, the expression in square brackets
simplifies to $\sqrt{2\pi}\,\mathbf{E}(\omega)\cdot\mathbf{E}^*(\omega) = \sqrt{2\pi}\,2I(\omega)/(c\epsilon_0)$. Then with the aid of (8.6) and (8.7),
the overall fluence (8.5) becomes

$$\int_{-\infty}^{\infty} I_{\text{tot}}(t,\tau)\,dt = 2\mathcal{E}\left[1 + \frac{1}{\mathcal{E}}\,\mathrm{Re}\int_{-\infty}^{\infty} I(\omega)\,e^{-i\omega\tau}\,d\omega\right] \tag{8.8}$$

2 For sub-nanosecond laser pulses, a detector automatically integrates the entire energy of the
pulse since a detector cannot keep up with temporal variations on such a rapid time scale. For
longer pulses, it may be necessary to force the integration.

3 Note that the second integral is insensitive to $\tau$ since a change of variables $t' = t - \tau$ converts it
into the first integral.

Albert Abraham Michelson (1852-1931, United States) was born in Poland, but he immigrated to the US with his parents and grew up in the rough mining towns of California and Nevada where his father was a merchant. Michelson attended high school in San Francisco. He entered the US Naval Academy in 1869 (with intervention from US President Grant after Michelson pleaded his case on the grounds near the White House). After two years at sea, Michelson returned to the Naval Academy to teach physics and mathematics for several years. Michelson was fascinated by the problem of determining the speed of light, and developed successive experiments to measure it more accurately. He is probably most famous for his experiment conducted at Case School of Applied Science in Cleveland with Edward Morley to detect the motion of the earth through the ether. Michelson later was a professor at the University of Chicago and then at Caltech. In 1907, he became the first American to win the Nobel prize, for his contributions to optics. Michelson married late in life and was the father of four. (Wikipedia)

With (8.8), we can rewrite the physical signal (8.4) in the more useful form

$$\text{Sig}(\tau) \propto 2\mathcal{E}\left[1 + \mathrm{Re}\{\gamma(\tau)\}\right] \tag{8.9}$$

where the dependence on the path delay $\tau$ is entirely contained in the degree of
coherence function $\gamma(\tau)$:4

$$\gamma(\tau) \equiv \frac{\int_{-\infty}^{\infty} I(\omega)\,e^{-i\omega\tau}\,d\omega}{\int_{-\infty}^{\infty} I(\omega)\,d\omega} \tag{8.10}$$

The denominator of (8.10) was rewritten with the help of Parseval's theorem
$\mathcal{E} = \int_{-\infty}^{\infty} I(t)\,dt = \int_{-\infty}^{\infty} I(\omega)\,d\omega$. Remarkably, the signal out of the Michelson
interferometer does not depend on the phase of $\mathbf{E}(\omega)$. It depends only on the amount
of light associated with each frequency through $I(\omega) \equiv \frac{c\epsilon_0}{2}\mathbf{E}(\omega)\cdot\mathbf{E}^*(\omega)$.

4 M. Born and E. Wolf, Principles of Optics, 7th ed., p. 570 (Cambridge University Press, 1999).

Alternate derivation of (8.9)

We could have derived (8.9) using another strategy, which may seem more intuitive
than the approach above. Equation (8.1) gives the intensity at the detector when a
single plane wave of frequency $\omega$ goes through the interferometer. Now suppose
that a waveform composed of many frequencies is sent through the interferometer.
The intensity associated with each frequency acts independently, obeying (8.1)
individually.

The total energy (per area) accumulated at the detector is then a linear superposition
of the spectral intensities of all frequencies present:

$$\int_{-\infty}^{\infty} I_{\text{tot}}(\omega,\tau)\,d\omega = \int_{-\infty}^{\infty} 2I(\omega)\left[1+\cos(\omega\tau)\right]d\omega \tag{8.11}$$

While this procedure may seem obvious, the fact that we can do it is remarkable!
Remember that it is usually the fields that we must add together before finding the
intensity of the resulting superposition. The formula (8.11) with its superposition
of intensities relies on the fact that the different frequencies inside the interferometer
when time-averaged (over all time) do not interfere. Certainly, the fields at
different frequencies do interfere (or beat in time). However, they constructively
interfere as often as they destructively interfere, and in a time-averaged picture it
is as though the individual frequency components transmit independently. Again,
in writing (8.11) we considered the light to be pulsed rather than continuous so
that the integrals converge.

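We can manipulate (8.11) as follows:

$$\int_{-\infty}^{\infty} I_{\text{tot}}(\omega,\tau)\,d\omega = 2\left[\int_{-\infty}^{\infty} I(\omega)\,d\omega\right]\left[1 + \frac{\int_{-\infty}^{\infty} I(\omega)\cos(\omega\tau)\,d\omega}{\int_{-\infty}^{\infty} I(\omega)\,d\omega}\right] \tag{8.12}$$

This is the same as (8.8) since we can replace $\cos(\omega\tau)$ with $\mathrm{Re}\{e^{-i\omega\tau}\}$, and we can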
apply Parseval's theorem (8.6) to the other integrals. Thus, the above arguments 
lead to (8.9) and (8.10). 
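
The degree of coherence (8.10) is straightforward to evaluate for a sampled spectrum. The sketch below is an illustration under assumptions: the flat-band test spectrum, its parameters, and the function name `degree_of_coherence` are hypothetical. For a flat band of half-width $a$ centered on $\omega_0$, direct integration of (8.10) gives $\gamma(\tau) = e^{-i\omega_0\tau}\sin(a\tau)/(a\tau)$, which serves as the reference.

```python
import numpy as np

def degree_of_coherence(omega, spectrum, tau):
    """gamma(tau) of (8.10) for a spectrum sampled on a uniform omega grid."""
    tau = np.atleast_1d(tau)
    d_omega = omega[1] - omega[0]
    numerator = np.exp(-1j * np.outer(tau, omega)) @ spectrum * d_omega
    denominator = np.sum(spectrum) * d_omega
    return numerator / denominator

# Assumed test spectrum: flat band of half-width 1 centered on omega0 = 50
omega0, half_width, n_samples = 50.0, 1.0, 2000
d_omega = 2 * half_width / n_samples
omega = omega0 - half_width + (np.arange(n_samples) + 0.5) * d_omega  # midpoints
spectrum = 7.3 * np.ones_like(omega)   # arbitrary detector units; they cancel

tau = np.array([0.0, 1.0, np.pi, 5.0])
gamma = degree_of_coherence(omega, spectrum, tau)

# Reference: np.sinc(x) = sin(pi x)/(pi x), so this equals sin(a tau)/(a tau)
expected = np.sinc(half_width * tau / np.pi) * np.exp(-1j * omega0 * tau)
print(np.abs(gamma - expected).max())
```

Only the shape of the spectrum enters the calculation. The overall scale (here the factor 7.3) divides out, and $\gamma(0) = 1$ for any spectrum.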



Example 8.1 

Compute the output signal when a Gaussian pulse with spectrum (7.25) is sent 
into a Michelson interferometer. 



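Solution: The power spectrum of the pulse is5

$$I(\mathbf{r},\omega) = \frac{c\epsilon_0}{2}\,\mathbf{E}_0\cdot\mathbf{E}_0^*\,T^2 e^{-T^2(\omega-\omega_0)^2}$$

where $T$ is the pulse duration, not to be confused with $\tau$, the delay of the interferometer
arm. As shown in Example 7.3, we also have

$$\int_{-\infty}^{\infty} I(\mathbf{r},\omega)\,d\omega = \frac{c\epsilon_0}{2}\,\mathbf{E}_0\cdot\mathbf{E}_0^*\,T\sqrt{\pi}$$

The degree of coherence (8.10) is then

$$\begin{aligned}
\gamma(\tau) &= \frac{T}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-T^2(\omega-\omega_0)^2}e^{-i\omega\tau}\,d\omega \\
&= \frac{T}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-T^2\omega^2 + (2T^2\omega_0 - i\tau)\omega - T^2\omega_0^2}\,d\omega \\
&= \frac{T}{\sqrt{\pi}}\sqrt{\frac{\pi}{T^2}}\;e^{\frac{(2T^2\omega_0 - i\tau)^2}{4T^2} - T^2\omega_0^2} \\
&= e^{-\frac{\tau^2}{4T^2}}\,e^{-i\omega_0\tau}
\end{aligned}$$

Formula (0.55) was used to complete the integration. According to (8.9), the signal
at the detector is then

$$\text{Sig}(\tau) \propto 2\mathcal{E}\left[1 + \mathrm{Re}\{\gamma(\tau)\}\right] = 2\mathcal{E}\left[1 + e^{-\frac{\tau^2}{4T^2}}\cos(\omega_0\tau)\right]$$

Figure 8.3 The output or signal from a Michelson interferometer for light with a Gaussian spectrum.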



Figure 8.3 shows this signal for a given T. As delay is added (or subtracted), the 
output signal oscillates. Eventually enough delay is introduced such that the 
very short pulses no longer interfere (arriving sequentially), and the output signal 
becomes steady. 
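
The result of this example can be cross-checked in the time domain. The Python sketch below is an illustration with assumed pulse parameters and hypothetical function names, in units where $c\epsilon_0/2 = 1$. It builds the Gaussian pulse whose power spectrum is the one used above, integrates $|\mathbf{E}(t)+\mathbf{E}(t-\tau)|^2$ over the whole pulse as in (8.4), and compares the result with $2\mathcal{E}[1+e^{-\tau^2/4T^2}\cos(\omega_0\tau)]$, where $\mathcal{E} = T\sqrt{\pi}$ in these units.

```python
import numpy as np

T, omega0 = 2.0, 12.0                    # assumed duration and carrier (arbitrary units)
t = np.linspace(-40.0, 40.0, 40001)
dt = t[1] - t[0]

def field(time):
    """Gaussian envelope times carrier, in units where c*eps0/2 = 1."""
    return np.exp(-time**2 / (2 * T**2)) * np.exp(-1j * omega0 * time)

def fluence_time_domain(tau):
    """Integrate |E(t) + E(t - tau)|^2 over the whole pulse, as in (8.4)."""
    total = field(t) + field(t - tau)
    return np.sum(np.abs(total) ** 2) * dt

def fluence_closed_form(tau):
    """2*fluence*[1 + exp(-tau^2 / 4T^2) * cos(omega0 * tau)] from the example."""
    one_arm = T * np.sqrt(np.pi)          # fluence from one arm alone
    return 2 * one_arm * (1 + np.exp(-tau**2 / (4 * T**2)) * np.cos(omega0 * tau))

for tau in (0.0, 0.3, 2.0, 10.0):
    print(tau, fluence_time_domain(tau), fluence_closed_form(tau))
```

At zero delay both columns give four times the single-arm fluence. For delays much longer than $T$ they settle at twice the single-arm fluence, consistent with pulses that arrive sequentially and no longer interfere.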



5 Technically, the output intensity is one fourth this, but our calculation of the degree of coher- 
ence is insensitive to amplitude. 
8.2 Coherence Time and Fringe Visibility

The degree of coherence function $\gamma(\tau)$ describes the oscillations in intensity
at the detector as the mirror in one arm of the interferometer is moved. The
real part of $\gamma(\tau)$ is analogous to $\cos(\omega\tau)$ in (8.1). However, for large delays $\tau$,
the oscillations tend to die off as different frequencies get out of sync: some
interfere constructively, while others interfere destructively. Narrowband light is
temporally more coherent than broadband light because there is less opportunity
for frequencies to get out of sync. Still, for large path differences, the oscillations
eventually die off, and the time-averaged intensity at the detector then remains
steady as the mirror is moved further.

The coherence time $\tau_c$ is the amount of delay necessary to cause $\gamma(\tau)$ to quit
oscillating (i.e. its amplitude approaches zero). This definition is not very precise,
since the oscillations do not usually have an abrupt end, but instead slowly die off
as $\tau$ increases. A useful (although arbitrary) analytic definition for the coherence
time is

$$\tau_c \equiv \int_{-\infty}^{\infty}|\gamma(\tau)|^2\,d\tau = 2\int_{0}^{\infty}|\gamma(\tau)|^2\,d\tau \tag{8.13}$$

The coherence length is the distance that light travels in this time:

$$\ell_c \equiv c\,\tau_c \tag{8.14}$$



Another useful concept is fringe visibility. The fringe visibility is defined in
the following way:

$$V(\tau) \equiv \frac{\max[\text{Sig}(\tau)] - \min[\text{Sig}(\tau)]}{\max[\text{Sig}(\tau)] + \min[\text{Sig}(\tau)]} \tag{8.15}$$

where $\max[\text{Sig}(\tau)]$ refers to the detector signal when the mirror is positioned
such that the amount of throughput to the detector is a local maximum, and
$\min[\text{Sig}(\tau)]$ refers to the detector signal when the mirror is positioned such that
the amount of throughput to the detector is a local minimum. The minimum
and the maximum don't occur at exactly the same $\tau$, but for optical frequencies
the difference in $\tau$ is only about half an optical period. As the mirror moves
a large distance from the equal-path-length position, the oscillations in $\text{Sig}(\tau)$
become less pronounced as the max and min tend to the same value, and the
fringe visibility goes to zero when $\gamma(\tau) = 0$. It is left as an exercise (see P8.1) to
show that the fringe visibility can be written as6

$$V(\tau) = |\gamma(\tau)| \tag{8.16}$$

Note that the fringe visibility depends only upon the frequency content of the 
light without regard to whether the frequency components are organized into a 
short pulse or a longer time pattern. 



6 M. Born and E. Wolf, Principles of Optics, 7th ed., p. 570 (Cambridge University Press, 1999). 
Example 8.2

Find the fringe visibility and the coherence time for the Gaussian pulse studied in
Example 8.1.

Solution: By (8.16), the fringe visibility is

$$V(\tau) = |\gamma(\tau)| = e^{-\frac{\tau^2}{4T^2}}$$

This is shown as the dashed line in Fig. 8.4. As expected, the fringe visibility dies
off as delay $\tau$ gets farther from the origin, the point where the interferometer arms
are equidistant. From (8.13) the coherence time is

$$\tau_c = \int_{-\infty}^{\infty}|\gamma(\tau)|^2\,d\tau = \int_{-\infty}^{\infty} e^{-\frac{\tau^2}{2T^2}}\,d\tau = \sqrt{2\pi}\,T$$

which is the delay necessary to cause the fringes to substantially diminish. 
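
Both results can be confirmed numerically. The Python sketch below is an illustration with assumed parameter values and hypothetical function names. It sums $|\gamma(\tau)|^2$ on a fine delay grid to estimate the coherence time (8.13), and it estimates the fringe visibility (8.15) from the largest and smallest signal values found within one fringe period around a chosen delay.

```python
import numpy as np

T, omega0 = 1.0, 40.0        # assumed pulse duration and carrier (arbitrary units)

def gamma(tau):
    """Degree of coherence for the Gaussian spectrum of Example 8.1."""
    return np.exp(-tau**2 / (4 * T**2)) * np.exp(-1j * omega0 * tau)

# Coherence time (8.13) by direct summation on a fine delay grid
tau_grid = np.linspace(-20 * T, 20 * T, 200001)
d_tau = tau_grid[1] - tau_grid[0]
tau_c = np.sum(np.abs(gamma(tau_grid)) ** 2) * d_tau

def visibility_near(tau_center):
    """Fringe visibility (8.15) from the max and min of Sig within one fringe."""
    period = 2 * np.pi / omega0
    local = np.linspace(tau_center - period / 2, tau_center + period / 2, 2001)
    sig = 1 + gamma(local).real          # Sig up to a constant factor, per (8.9)
    return (sig.max() - sig.min()) / (sig.max() + sig.min())

print(tau_c, np.sqrt(2 * np.pi) * T)
print(visibility_near(1.5), abs(gamma(1.5)))
```

The visibility estimate matches $|\gamma(\tau)|$ only approximately, because the maximum and minimum occur at slightly different delays where the envelope differs a little. The agreement improves as the carrier period becomes short compared with $T$.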




Figure 8.4 $\mathrm{Re}\{\gamma(\tau)\}$ (solid) and $|\gamma(\tau)|$ (dashed) for a light pulse with a Gaussian spectrum as in Examples 8.1 and 8.2.



8.3 Temporal Coherence of Continuous Sources 

Consider a continuous light source such as starlight or a continuous wave (CW)
laser. The integral $\int_{-\infty}^{\infty} I(t)\,dt$ diverges for such a source, since it is on forever (or
at least for a very long time) and emits infinite (or very much) energy. However,
note that the integrals on both sides of (8.5) diverge in the same way. We can
renormalize (8.5) in this case by replacing the integrals on each side with the
average value of the intensity:

$$I_{\text{ave}} \equiv \langle I(t)\rangle_t = \frac{1}{T}\int_{-T/2}^{T/2} I(t)\,dt \qquad \text{(continuous source)} \tag{8.17}$$

The duration $T$ must be large enough to average over any fluctuations that are
present in the light source. The average in (8.17) should not be used on a pulsed
light source since the result would depend on the duration $T$ of the temporal
window.

For a continuous light source, the signal at the detector (8.9) becomes

$$\text{Sig}(\tau) \propto 2\langle I(t)\rangle_t\left[1 + \mathrm{Re}\,\gamma(\tau)\right] \qquad \text{(continuous source)} \tag{8.18}$$

Although technically the integrals used in (8.10) to compute $\gamma(\tau)$ also diverge
in the case of continuous light, the numerator and the denominator diverge in
the same way. Therefore, we may renormalize $I(\omega)$ in any way we like to deal
with this problem. Both the numerator and denominator of (8.10) contain $I(\omega)$,
so regardless of how large $I(\omega)$ is or what units the measurement gives (volts
or whatever), we can just plug the instrument reading directly into (8.10). The
units in the numerator and denominator cancel so that $\gamma(\tau)$ always remains
dimensionless. Once we have the degree of coherence function $\gamma(\tau)$, we can
calculate the coherence time and fringe visibility just as we did for pulsed sources. 
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8.4 Fourier Spectroscopy 

As we saw in (8.8), the signal output from a Michelson interferometer for a pulsed
input may be written as

$$\mathrm{Sig}(\tau) \propto 2\mathcal{E} + 2\,\mathrm{Re} \int_{-\infty}^{\infty} I(\omega)\, e^{-i\omega\tau}\, d\omega \tag{8.19}$$

Typically, the signal comes in the form of a voltage or a current from a sensor.
However, the signal can easily be normalized to the beam fluence. In particular,
for large $\tau$ the fringe visibility goes to zero (i.e. $\gamma(\tau) = 0$), and the normalized
signal must approach

$$2\mathcal{E} \equiv 2 \int_{-\infty}^{\infty} I(t)\, dt \tag{8.20}$$

We will assume that this normalization has taken place and write (8.19) as an
equality.

Given our measurement of $\mathrm{Sig}(\tau)$, we would like to find the power spectrum
$I(\omega)$. Unfortunately, $I(\omega)$ is buried within an integral in (8.19). However, since the
integral looks like an inverse Fourier transform of $I(\omega)$, we will be able to extract
the desired spectrum after some manipulation. This procedure for extracting $I(\omega)$
from an interferometric measurement is known as Fourier spectroscopy. 7



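Extracting $I(\omega)$

We first take the Fourier transform of (8.19): 8

$$\mathcal{F}\{\mathrm{Sig}(\tau)\} = \mathcal{F}\{2\mathcal{E}\} + \mathcal{F}\left\{ 2\,\mathrm{Re} \int_{-\infty}^{\infty} I(\omega)\, e^{-i\omega\tau}\, d\omega \right\} \tag{8.21}$$

The left-hand side is known since it is the measured data, and a computer can be
employed to take the Fourier transform of it. The first term on the right-hand side
is the Fourier transform of a constant:

$$\mathcal{F}\{2\mathcal{E}\} = 2\mathcal{E}\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{i\omega\tau}\, d\tau = 2\mathcal{E} \sqrt{2\pi}\, \delta(\omega) \tag{8.22}$$

Notice that (8.22) is zero everywhere except where $\omega = 0$, where a spike occurs.
This represents the DC component of $\mathcal{F}\{\mathrm{Sig}(\tau)\}$.

The second term of (8.21) can be written as

$$\mathcal{F}\left\{ 2\,\mathrm{Re} \int_{-\infty}^{\infty} I(\omega)\, e^{-i\omega\tau}\, d\omega \right\} = \mathcal{F}\left\{ \int_{-\infty}^{\infty} I(\omega)\, e^{-i\omega\tau}\, d\omega + \int_{-\infty}^{\infty} I(\omega)\, e^{i\omega\tau}\, d\omega \right\}$$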



7 J. Peatross and S. Bergeson, "Fourier Spectroscopy of Ultrashort Laser Pulses," Am. J. Phys. 74, 
842-845 (2006). 

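8 This is weird since normally we take Fourier transforms on fields rather than expressions
involving intensity!

Carrying out the Fourier transform then gives

$$\int_{-\infty}^{\infty} \left( \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} I(\omega')\, e^{-i\omega'\tau}\, d\omega' \right) e^{i\omega\tau}\, d\tau + \int_{-\infty}^{\infty} \left( \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} I(\omega')\, e^{i\omega'\tau}\, d\omega' \right) e^{i\omega\tau}\, d\tau$$

which we rearrange to

$$\sqrt{2\pi} \int_{-\infty}^{\infty} I(\omega') \left( \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i(\omega - \omega')\tau}\, d\tau \right) d\omega' + \sqrt{2\pi} \int_{-\infty}^{\infty} I(\omega') \left( \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i(\omega + \omega')\tau}\, d\tau \right) d\omega'$$

From (0.52) we note that the terms in parentheses are delta functions, so we have

$$\sqrt{2\pi} \left[ \int_{-\infty}^{\infty} I(\omega')\, \delta(\omega' - \omega)\, d\omega' + \int_{-\infty}^{\infty} I(\omega')\, \delta(\omega' + \omega)\, d\omega' \right]$$

The remaining frequency integrals can then be easily performed to obtain our final
form:

$$\mathcal{F}\left\{ 2\,\mathrm{Re} \int_{-\infty}^{\infty} I(\omega)\, e^{-i\omega\tau}\, d\omega \right\} = \sqrt{2\pi} \left[ I(\omega) + I(-\omega) \right] \tag{8.23}$$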



With (8.22) and (8.23) we can write (8.21) as

$$\frac{\mathcal{F}\{\mathrm{Sig}(\tau)\}}{\sqrt{2\pi}} = 2\mathcal{E}\, \delta(\omega) + I(\omega) + I(-\omega) \tag{8.24}$$

The Fourier transform of the measured signal is seen to contain three terms, one
of which is the power spectrum $I(\omega)$ that we are after. Fortunately, when graphed
as a function of $\omega$ (shown in Fig. 8.5), the three terms on the right-hand side
typically do not overlap. As a reminder, the measured signal as a function of $\tau$
looks something like that in Fig. 8.3. The oscillation frequency of the fringes lies
in the neighborhood of $\omega_0$. The procedure to obtain $I(\omega)$ is (1) record $\mathrm{Sig}(\tau)$; (2)
if desired, normalize by its value at large $\tau$; (3) take its Fourier transform; and (4)
extract the curve at positive frequencies.
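The four-step procedure can be sketched numerically. The example below is only an illustration under stated assumptions: the "measured" signal is synthesized from the Gaussian degree of coherence of Example 8.2, and the center wavelength (800 nm), the duration parameter $T$ (10 fs), and the sampling step are values chosen here rather than taken from the text. NumPy's FFT uses the opposite sign convention in the exponent from the one in this book, but for a real signal this only interchanges the $I(\omega)$ and $I(-\omega)$ peaks.

```python
import numpy as np

# Assumed illustrative pulse: 800 nm center wavelength, T = 10 fs
c = 3.0e8
omega0 = 2 * np.pi * c / 800e-9
T = 10e-15

# Delay axis: 0.1 fs steps, so the Nyquist frequency lies far above omega0
N = 2**14
dtau = 0.1e-15
tau = (np.arange(N) - N // 2) * dtau

# Steps (1)-(2): synthetic signal, already normalized to 1 at large delay
sig = 1.0 + np.exp(-tau**2 / (4 * T**2)) * np.cos(omega0 * tau)

# Step (3): Fourier transform. Subtracting the large-delay baseline removes
# the delta-function spike at omega = 0 (the DC term in (8.24)).
spectrum = np.abs(np.fft.fft(sig - 1.0))
omega = 2 * np.pi * np.fft.fftfreq(N, d=dtau)
d_omega = 2 * np.pi / (N * dtau)  # frequency resolution of the grid

# Step (4): keep only the positive-frequency peak, which is I(omega)
keep = omega > 0
omega_pos = omega[keep]
I_pos = spectrum[keep] / spectrum[keep].max()

omega_peak = omega_pos[np.argmax(I_pos)]
fwhm_numeric = np.count_nonzero(I_pos > 0.5) * d_omega
fwhm_expected = 2 * np.sqrt(np.log(2)) / T  # FWHM of exp(-T^2 (omega - omega0)^2)

print("recovered / true center frequency:", omega_peak / omega0)
print("recovered / expected FWHM:", fwhm_numeric / fwhm_expected)
```

With real data, `sig` would be replaced by the recorded detector readings, and the delay step must be small enough that the fringes (which oscillate near $\omega_0$) are sampled more than twice per period.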

8.5 Young's Two-Slit Setup and Spatial Coherence 

In close analogy with the Michelson interferometer, which is able to investigate
temporal coherence, a Young's two-slit setup can be used to investigate spatial
coherence of quasi-monochromatic light. Thomas Young, who lived nearly a century
before Michelson, used his two-slit setup for the first conclusive demonstration
that light propagates as a wave. The Young's double-slit setup and the Michelson 
interferometer have in common that two beams of light travel different paths 
and then interfere. In the Michelson interferometer, one path is delayed with 
respect to the other so that temporal effects can be studied. In the Young's two-slit 
setup, two laterally separate points of the same wave are compared as they are 
sent through two slits. 



Figure 8.5 A graphical depiction of $\mathcal{F}\{\mathrm{Sig}(\tau)\}/\sqrt{2\pi}$.
[Plot labels: $2\mathcal{E}\,\delta(\omega)$ at the origin; horizontal axis $\omega$.]

Figure 8.6 A point source produces coherent (locked phases) light. When this light
traverses two slits and arrives at a screen, it produces a fringe pattern.

Depending on the coherence of the light entering each slit, the fringe pattern 
observed can exhibit good or poor visibility. Just as the Michelson interferometer 
is sensitive to the spectral content of light, the Young's two-slit setup is sensitive 
to the spatial extent of the light source illuminating the two slits. For example, if 
light from a distant star (restricted by a filter to a narrow spectral range) is used to 
illuminate a double-slit setup, the resulting interference pattern appearing on a 
subsequent screen shows good or poor fringe visibility depending on the angular 
width of the star. Michelson was the first to use this type of setup to measure the 
angular width of stars. 

Light emerging from a single ideal point source has wave fronts that are 
spatially uniform in a lateral sense (see Fig. 8.6). Such wave fronts are said to be 
spatially coherent, even if the temporal coherence is not perfect (i.e. if a range
of frequencies is present). When spatially coherent light illuminates a Young's
two-slit setup, fringes of maximum visibility are seen at a distant screen, meaning 
the fringes vary between a maximum intensity and zero. 

Consider a Young's two-slit setup illuminated by a single point source. We
represent the fields on a subsequent screen that transmit through each slit,
respectively, as $E_0 e^{i(kd_1 - \omega t)}$ and $E_0 e^{i(kd_2 - \omega t)}$. We have assumed
that the slits are equidistant from the point source and that the two fields at the
screen are identical other than for their phases. In close analogy with (8.1), the
resulting intensity pattern on a far-away screen is

$$I_\mathrm{tot}(h) = 2I_0 \left[ 1 + \cos(kd_2 - kd_1) \right] = 2I_0 \left[ 1 + \cos(khy/D) \right] \tag{8.25}$$

Notice the close similarity between this expression and the output from a Michelson
interferometer for a plane wave (8.1). We will consider h (the separation of
the slits) to be the counterpart of $\tau$ (the delay introduced by moving a mirror in
the Michelson interferometer). To obtain the final expression in (8.25) we made
use of the following Taylor expansions:

$$d_1(y) = \sqrt{(y - h/2)^2 + D^2} = D \sqrt{1 + \frac{(y - h/2)^2}{D^2}} \cong D \left[ 1 + \frac{(y - h/2)^2}{2D^2} + \cdots \right] \tag{8.26}$$

and

$$d_2(y) = \sqrt{(y + h/2)^2 + D^2} = D \sqrt{1 + \frac{(y + h/2)^2}{D^2}} \cong D \left[ 1 + \frac{(y + h/2)^2}{2D^2} + \cdots \right] \tag{8.27}$$

These approximations are valid as long as $D \gg y$ and $D \gg h$.
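To get a feel for how good these expansions are, the short sketch below compares the exact path difference $d_2 - d_1$ with the approximation $hy/D$ used in (8.25). The numbers are assumptions chosen for illustration (a HeNe wavelength of 633 nm, $h = 0.1$ mm, $D = 2$ m, and a screen region of $\pm 5$ cm); they mirror the lab setup of L8.8 but are not prescribed by the derivation.

```python
import numpy as np

# Assumed illustrative values (similar to the lab setup in L8.8)
wavelength = 633e-9   # HeNe laser wavelength (m)
h = 0.1e-3            # slit separation (m)
D = 2.0               # slit-to-screen distance (m)
k = 2 * np.pi / wavelength

# Positions on the screen, within 5 cm of the axis
y = np.linspace(-0.05, 0.05, 2001)

# Exact distances from each slit to the screen point y
d1 = np.sqrt((y - h / 2)**2 + D**2)
d2 = np.sqrt((y + h / 2)**2 + D**2)

exact = d2 - d1          # exact path difference
approx = h * y / D       # result of the Taylor expansions (8.26) and (8.27)

# Largest error in the phase k(d2 - d1) across the screen region, in radians
max_phase_error = np.max(np.abs(k * (exact - approx)))

# Fringe period on the screen: cos(k h y / D) repeats when y changes by lambda D / h
fringe_spacing = wavelength * D / h

print("max phase error (rad):", max_phase_error)
print("fringe spacing (mm):", fringe_spacing * 1e3)
```

For these values the phase error stays far below one radian, so the fringe positions predicted by (8.25) are essentially exact over the region considered.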

We next consider how to modify (8.25) so that it applies to the case when 
the two slits are illuminated by a collection of point sources distributed over a 
finite lateral extent. This situation is depicted in Fig. 8.7 and it leads to partial 
spatial coherence if the phase of each point emitter fluctuates randomly. 9 When 
a Young's two-slit setup is illuminated by an extended random source, the wave 
fronts at the two slits are less correlated. This makes the fringes move around on 
the screen rapidly and partially 'wash out' when time averaged, meaning worse 
fringe visibility. 

To simplify our analysis, we restrict the distribution of point sources to vary
only in the $y'$ dimension. 10 We assume that the light is quasi-monochromatic so
that its frequency is approximately $\omega$ with a phase that fluctuates randomly over
time intervals much longer than the period of oscillation $2\pi/\omega$. 11

The light emerging from the $j$th point at $y'_j$ travels by means of two very
narrow slits to a point $y$ on a screen. Let $E_1(y'_j)$ and $E_2(y'_j)$ be the fields on the



9 A laser beam does not fit the definition of a light source with randomly varying spatial phase. 
Instead, in this section we consider a source such as the surface of a star (filtered to a narrow 
frequency range). See appendix 8.B for more discussion. 
10 The results can be generalized to a two-dimensional source. 

11 Random phase fluctuations necessarily imply some frequency bandwidth, however small.
Hence the need to specify quasi-monochromatic light.

Thomas Young (1773-1829, English) was born in Milverton, Somerset, and was the
oldest of ten children. By age fourteen, he had become proficient at a dozen
different languages. As a young adult, he studied medicine and then went to
Göttingen, Germany, where he earned a doctoral degree in physics. In 1801, he was
appointed professor of natural philosophy at the Royal Institution, but he also
maintained an active medical practice on the side. He contributed to a wide variety
of fields and helped to decipher ancient Egyptian hieroglyphs, including the
Rosetta Stone. He published descriptions of the heart and arteries as well as how
the eye accommodates to see at different depths and how the eye perceives color. In
engineering fields, Young is well known for his analysis of stresses and strains in
elastic media. Young's double-slit experiment gave convincing evidence of the wave
nature of light, overturning Newton's corpuscular theory. Regarding this, Thomas
Young traded ideas with Augustin Fresnel through correspondence. (Wikipedia)



screen at $y$, both originating from the point $y'_j$, but traveling respectively through
the two different slits. We assume that these fields have the same polarization,
and we will suppress the vectorial nature of the fields. For simplicity, we assume
the two fields have the same (real) amplitude at the screen $E_0(y'_j)$. Thus, we write
the two fields as

$$E_1(y'_j) = E_0(y'_j)\, e^{i\left\{ k\left[ r_1(y'_j) + d_1(y) \right] - \omega t + \phi(y'_j) \right\}} \tag{8.28}$$

and

$$E_2(y'_j) = E_0(y'_j)\, e^{i\left\{ k\left[ r_2(y'_j) + d_2(y) \right] - \omega t + \phi(y'_j) \right\}} \tag{8.29}$$

We have explicitly included an arbitrary phase $\phi(y'_j)$, which we will take to be
different for each point source.

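We now set about finding the cumulative field at $y$ arising from the many
points indexed by the subscript $j$. The total field on the screen at point $y$ is

$$E_\mathrm{tot}(h) = \sum_j \left[ E_1(y'_j) + E_2(y'_j) \right] \tag{8.30}$$

Obviously, in addition to $h$, the total field depends on $y$, $R$, $D$, and $k$ as well as on
the phase $\phi(y'_j)$ at each point. Nevertheless, in the end we will mainly emphasize
the dependence on the slit separation $h$. The intensity associated with (8.30) is

$$
\begin{aligned}
I_\mathrm{tot}(h) &= \frac{1}{2} \epsilon_0 c\, |E_\mathrm{tot}(h)|^2 \\
&= \frac{\epsilon_0 c}{2} \left[ \sum_j E_1(y'_j) + E_2(y'_j) \right] \left[ \sum_m E_1(y'_m) + E_2(y'_m) \right]^* \\
&= \frac{\epsilon_0 c}{2} \sum_{j,m} \left[ E_1(y'_j) E_1^*(y'_m) + E_2(y'_j) E_2^*(y'_m) + E_1(y'_j) E_2^*(y'_m) + E_2(y'_j) E_1^*(y'_m) \right] \\
&= \frac{\epsilon_0 c}{2} \sum_{j,m} |E_0(y'_j)|\, |E_0(y'_m)| \Big[ e^{ik\left[ r_1(y'_j) - r_1(y'_m) \right]} e^{i\left[ \phi(y'_j) - \phi(y'_m) \right]} + e^{ik\left[ r_2(y'_j) - r_2(y'_m) \right]} e^{i\left[ \phi(y'_j) - \phi(y'_m) \right]} \\
&\qquad\qquad + 2\,\mathrm{Re}\left\{ e^{ik\left[ r_1(y'_j) - r_2(y'_m) \right]} e^{ik\left( d_1(y) - d_2(y) \right)} e^{i\left[ \phi(y'_j) - \phi(y'_m) \right]} \right\} \Big]
\end{aligned}
\tag{8.31}
$$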

At this juncture we make a critical assumption: that the phase of the emission
$\phi(y'_j)$ varies in time independently at every point on the source. This is sometimes
called the stochastic assumption, and it is appropriate for the emission from thermal
sources such as starlight, a glowing filament (filtered to a narrow frequency
range), or spontaneous emission from an excited gas or plasma. However, it is
not appropriate for coherent sources like lasers.

A wonderful simplification happens to (8.31) when the phase difference
$\phi(y'_j) - \phi(y'_m)$ varies randomly. If $j \neq m$, then $\exp\{i[\phi(y'_j) - \phi(y'_m)]\}$
time-averages to zero. On the other hand, if $j = m$, then the factor reduces to $e^0 = 1$.
Formally, this is written

$$\left\langle e^{i\left[ \phi(y'_j) - \phi(y'_m) \right]} \right\rangle_t = \delta_{j,m} \equiv \begin{cases} 1 & \text{if } j = m, \\ 0 & \text{if } j \neq m. \end{cases} \quad \text{(random phase assumption)} \tag{8.32}$$

where $\delta_{j,m}$ is known as the Kronecker delta function. The time-averaged intensity
under the stochastic assumption (8.32) then reduces to

$$\langle I_\mathrm{tot}(h) \rangle_t = \sum_j I(y'_j) + \sum_j I(y'_j) + 2\,\mathrm{Re} \left\{ \sum_j I(y'_j)\, e^{ik\left( r_1(y'_j) - r_2(y'_j) \right)} e^{ik\left( d_1(y) - d_2(y) \right)} \right\} \tag{8.33}$$

where $I(y'_j) \equiv \frac{1}{2} \epsilon_0 c\, |E_0(y'_j)|^2$. We may use (8.26) and (8.27) to simplify
$d_1(y) - d_2(y) \cong -hy/D$. Very similarly we may also write
$r_1(y'_j) - r_2(y'_j) \cong -hy'_j/R$. The only thing left to do is to put (8.33) into a
slightly more familiar form:

$$\langle I_\mathrm{tot}(h) \rangle_t = 2 \sum_j I(y'_j) \left[ 1 + \mathrm{Re}\{\gamma(h)\} \right] \quad \text{(random phase assumption)} \tag{8.34}$$

We have introduced

$$\gamma(h) \equiv \frac{e^{-i\frac{khy}{D}} \sum_j I(y'_j)\, e^{-i\frac{khy'_j}{R}}}{\sum_j I(y'_j)} \tag{8.35}$$

i 

which is known as the degree of coherence. It controls the fringe pattern seen at 
the screen. 
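The random phase assumption can be tested with a small Monte Carlo experiment. The sketch below is only illustrative: it assumes 50 equally bright point emitters spread over 1 mm, the small-angle path differences $r_1 - r_2 \cong -hy'_j/R$ and $d_1 - d_2 \cong -hy/D$, and arbitrary values for the wavelength and distances. Intensities are expressed in units where $\frac{1}{2}\epsilon_0 c = 1$. Each "time sample" draws a fresh independent phase for every emitter, and the averaged intensity is compared with (8.34) and (8.35).

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed illustrative geometry
wavelength, R, D = 633e-9, 1.0, 2.0   # wavelength, source-slit and slit-screen distances (m)
k = 2 * np.pi / wavelength
h = 0.2e-3                            # slit separation (m)
y = 0.0                               # observation point on the screen (m)

# Discrete emitters across a 1 mm wide source, equal intensity (arbitrary units)
y_src = np.linspace(-0.5e-3, 0.5e-3, 50)
I_src = np.ones_like(y_src)

# Phase accumulated along each route, up to a common phase that drops out
theta1 = -k * h * y_src / (2 * R) - k * h * y / (2 * D)   # via slit 1
theta2 = +k * h * y_src / (2 * R) + k * h * y / (2 * D)   # via slit 2
amplitude = np.sqrt(I_src) * (np.exp(1j * theta1) + np.exp(1j * theta2))

# Time average: every sample assigns each emitter an independent random phase
n_samples = 20000
phi = rng.uniform(0.0, 2 * np.pi, size=(n_samples, y_src.size))
E_tot = np.sum(amplitude * np.exp(1j * phi), axis=1)
I_avg = np.mean(np.abs(E_tot)**2)

# Prediction from (8.34) with the degree of coherence (8.35)
gamma = (np.exp(-1j * k * h * y / D)
         * np.sum(I_src * np.exp(-1j * k * h * y_src / R)) / np.sum(I_src))
I_theory = 2 * np.sum(I_src) * (1 + gamma.real)

print("simulated / predicted intensity:", I_avg / I_theory)
print("Re gamma(h):", gamma.real)
```

Because the instantaneous intensity fluctuates strongly (it is a speckle-like quantity), the agreement improves only as the inverse square root of the number of samples; with 20000 samples the ratio should be within a few percent of one.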

We can generalize (8.34) so that it applies to the case of a continuous distribution
of light as opposed to a collection of discrete point sources. In Appendix 8.A
we show how summations in (8.34) and (8.35) become integrals over the source
intensity distribution, and we write

$$\langle I_\mathrm{tot}(h) \rangle_t = 2 \langle I_\mathrm{oneslit} \rangle_t \left[ 1 + \mathrm{Re}\{\gamma(h)\} \right] \quad \text{(random phase assumption)} \tag{8.36}$$

where

$$\gamma(h) \equiv \frac{e^{-i\frac{khy}{D}} \int_{-\infty}^{\infty} I(y')\, e^{-i\frac{khy'}{R}}\, dy'}{\int_{-\infty}^{\infty} I(y')\, dy'} \tag{8.37}$$

Here $I(y')$ has units of intensity per length.

The factor $\exp(-ikhy/D)$ defines the positions of the periodic fringes on
the screen. The remainder of (8.37) controls the depth of the fringes as the slit
separation $h$ is varied. When the slit separation $h$ increases, the amplitude of $\gamma(h)$
tends to diminish until the intensity at the screen becomes uniform. When the
two slits have very small separation (such that $e^{-i\frac{khy'}{R}} \cong 1$) then we have
$|\gamma(h)| = 1$ and very good fringe visibility results. $\gamma(h)$ dictates the degree of
spatial coherence in much the same way that $\gamma(\tau)$ dictates the degree of temporal
coherence. Notice the close similarity between (8.37) and (8.10).

As the slit separation $h$ increases, the fringe visibility

$$V(h) = |\gamma(h)| \tag{8.38}$$

diminishes, eventually approaching zero (see (8.16)). In analogy to the temporal
case (see (8.13)), we can define a slit separation sufficiently large to make the
fringes at the screen 'wash out':

$$h_c \equiv 2 \int_0^{\infty} |\gamma(h)|^2\, dh \tag{8.39}$$
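As a concrete illustration of (8.37) and (8.39), the sketch below evaluates $|\gamma(h)|$ numerically for a source of uniform intensity and then integrates to estimate $h_c$. The source width ($a = 0.5$ mm), the distance $R = 1$ m, and the wavelength are assumptions made for this example. For a uniform source the integral in (8.37) gives a sinc function, which leads to $h_c = \lambda R / a$; that closed form is used here only as a cross-check, and the integration is truncated at $50\, h_c$, which slightly underestimates the result.

```python
import numpy as np

# Assumed illustrative values
wavelength = 633e-9   # m
R = 1.0               # distance from source to double slit (m)
a = 0.5e-3            # width of the uniform source (m)
k = 2 * np.pi / wavelength

# Sample the source at midpoints of equal sub-intervals
n_src = 500
y_src = (np.arange(n_src) + 0.5) / n_src * a - a / 2
I_src = np.ones(n_src)   # uniform intensity per unit length (arbitrary units)

# Closed-form expectation for a uniform source, used only as a cross-check
h_c_exact = wavelength * R / a

# Slit separations from zero out to many times the expected h_c
h = np.linspace(0.0, 50 * h_c_exact, 2501)

# |gamma(h)| from (8.37); the prefactor exp(-i k h y / D) has unit magnitude
phase = np.exp(-1j * k * np.outer(h, y_src) / R)
gamma_mag = np.abs(phase @ I_src) / I_src.sum()

# h_c from (8.39) by the trapezoid rule
f = gamma_mag**2
dh = h[1] - h[0]
h_c_numeric = 2 * dh * (f.sum() - 0.5 * (f[0] + f[-1]))

print("numeric h_c (mm):", h_c_numeric * 1e3)
print("lambda R / a (mm):", h_c_exact * 1e3)
```

The wider the source, the smaller $h_c$ becomes, which is the quantitative version of the statement that an extended random source washes out the fringes sooner.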

Appendix 8.A Spatial Coherence for a Continuous Source 

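In this appendix we examine the spatial coherence of light from a continuous spatial
distribution (as opposed to a collection of discrete point sources) and justify
(8.36) and (8.37). We begin by replacing the summations in (8.31) with integrals
over a continuous emission source. We make the following replacements:

$$
\begin{aligned}
\sum_j E_1(y'_j) &\rightarrow \int_{-\infty}^{\infty} E_1(y')\, dy' \quad \text{and} \quad \sum_m E_1(y'_m) \rightarrow \int_{-\infty}^{\infty} E_1(y'')\, dy'' \\
\sum_j E_2(y'_j) &\rightarrow \int_{-\infty}^{\infty} E_2(y')\, dy' \quad \text{and} \quad \sum_m E_2(y'_m) \rightarrow \int_{-\infty}^{\infty} E_2(y'')\, dy''
\end{aligned}
\tag{8.40}
$$

Rather than deal with a time average of randomly varying phases, we will instead
work with a linear superposition of all conceivable phase factors. That is, we will
write the phase $\phi(y')$ as $Ky'$, where $K$ is a parameter with units of inverse length,
which we allow to take on all possible real values with uniform likelihood. The
way we modify (8.32) for the continuous case is then

$$\left\langle e^{i\left[ \phi(y'_j) - \phi(y'_m) \right]} \right\rangle_t = \delta_{j,m} \rightarrow \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{iK(y' - y'')}\, dK = \delta(y'' - y') \tag{8.41}$$

With the replacements in (8.40) and (8.41), (8.31) becomes

$$
\begin{aligned}
I_\mathrm{tot}(h) = \frac{\epsilon_0 c}{2} \int_{-\infty}^{\infty} dy'\, |E(y')| \int_{-\infty}^{\infty} dy''\, |E(y'')| \Big[ & e^{ik\left( r_1(y') - r_1(y'') \right)} + e^{ik\left( r_2(y') - r_2(y'') \right)} \\
& + 2\,\mathrm{Re}\left\{ e^{ik\left( d_1(y) - d_2(y) \right)} e^{ik\left( r_1(y') - r_2(y'') \right)} \right\} \Big]\, \delta(y'' - y')
\end{aligned}
\tag{8.42}
$$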

Again, consistent with (8.26) and (8.27), we may write $d_1(y) - d_2(y) \cong -hy/D$ and
$r_1(y') - r_2(y') \cong -hy'/R$, and (8.42) reduces to

$$I_\mathrm{tot}(h) = 2 \int_{-\infty}^{\infty} I(y')\, dy' + 2\,\mathrm{Re} \left\{ e^{-i\frac{khy}{D}} \int_{-\infty}^{\infty} I(y')\, e^{-i\frac{khy'}{R}}\, dy' \right\} \tag{8.43}$$

where

$$I(y') \equiv \frac{1}{2} \epsilon_0 c\, |E(y')|^2 \tag{8.44}$$

For $I_\mathrm{tot}$ to have normal units of intensity, $I(y')$ must have units of intensity per
length of source, implying that $E(y')$ has units of field per square root of length.
Hence, $\int_{-\infty}^{\infty} I(y')\, dy'$ is the intensity at the screen caused by the entire
extended source when only one slit is open. We see that (8.43) is equivalent to (8.36)
and (8.37).

Appendix 8.B Van Cittert-Zernike Theorem 

In this appendix we avoid making the assumption of randomly varying phase. 
This would be the case when the source of light is, for example, a laser. In place of 
(8.42) we have 



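$$
\begin{aligned}
I_\mathrm{tot}(h) \propto{} & \left| \int_{-\infty}^{\infty} \left[ |E(y')|\, e^{i\phi(y') + i\frac{ky'^2}{2R}} \right] e^{-i\frac{khy'}{2R}}\, dy' \right|^2 + \left| \int_{-\infty}^{\infty} \left[ |E(y')|\, e^{i\phi(y') + i\frac{ky'^2}{2R}} \right] e^{i\frac{khy'}{2R}}\, dy' \right|^2 \\
& + 2\,\mathrm{Re}\; e^{-i\frac{khy}{D}} \left\{ \int_{-\infty}^{\infty} \left[ |E(y')|\, e^{i\phi(y') + i\frac{ky'^2}{2R}} \right] e^{-i\frac{khy'}{2R}}\, dy' \right\} \left\{ \int_{-\infty}^{\infty} \left[ |E(y')|\, e^{i\phi(y') + i\frac{ky'^2}{2R}} \right] e^{i\frac{khy'}{2R}}\, dy' \right\}^*
\end{aligned}
\tag{8.45}
$$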



where we have employed (8.26) and (8.27) and similar expressions involving R 
and y' . 

The first term on the right-hand side of (8.45) is the intensity on the screen 
when the lower slit is covered. The second term is the intensity on the screen 
when the upper slit is covered. The last term is the interference term, which 
modifies the sum of the individual intensities when both slits are uncovered. 

Notice the occurrence of Fourier transforms (over position) on the quantities 
inside of the square brackets. Later, when we study diffraction theory, we will 
recognize these transforms as determining the strength of fields impinging on 
the individual slits. This corresponds to a major difference between a spatially 
coherent source and a random-phase source. With the random-phase source, the 
slits are always illuminated with the same strength regardless of the separation. 
However, with a coherent source, 'beaming' can occur such that the strength as 
well as phase of the field at each slit depends on the slit separation. 

A beautiful simplification occurs when the phase of the emitted light has the 
following distribution: 

$$\phi(y') = -\frac{ky'^2}{2R} \quad \text{(converging spherical wave)} \tag{8.46}$$

Equation (8.46) is not as arbitrary as it may first appear. This particular phase
is an approximation to a concave spherical wave front converging to the center
between the two slits. This type of wave front is created when a plane wave passes
through a lens. With the special phase (8.46), the intensity (8.45) reduces to

$$
\begin{aligned}
I_\mathrm{tot}(h) \propto{} & \left| \int_{-\infty}^{\infty} |E(y')|\, e^{-i\frac{khy'}{2R}}\, dy' \right|^2 + \left| \int_{-\infty}^{\infty} |E(y')|\, e^{i\frac{khy'}{2R}}\, dy' \right|^2 \\
& + 2\,\mathrm{Re}\; e^{-i\frac{khy}{D}} \left\{ \int_{-\infty}^{\infty} |E(y')|\, e^{-i\frac{khy'}{2R}}\, dy' \right\}^2 \quad \text{(converging spherical wave assumption)}
\end{aligned}
\tag{8.47}
$$

There is a close resemblance between the expression in the first term

$$\left| \int_{-\infty}^{\infty} |E(y')|\, e^{-i\frac{khy'}{2R}}\, dy' \right| \tag{8.48}$$



and the magnitude of the degree of coherence $V = |\gamma(h/2)|$ from (8.37). Again,
this corresponds to the field that goes through the upper slit, when it is positioned
at $h/2$, and which impinges on the screen. Let this field be denoted by $|E_1(h/2)|$.
The field strength when the single slit is positioned at $h$ compared to that when it
is positioned at zero is

$$\left| \frac{E_1(h)}{E_1(0)} \right| = \frac{\left| \int_{-\infty}^{\infty} |E(y')|\, e^{-i\frac{khy'}{R}}\, dy' \right|}{\int_{-\infty}^{\infty} |E(y')|\, dy'} \tag{8.49}$$

This looks very much like $|\gamma(h)|$ of (8.37) except that the magnitude of the field
appears in (8.49), whereas the intensity appears in (8.37).

This may seem rather contrived, but at least it is cute, and it is known as the 
van Cittert-Zernike theorem. 12 It says that the spatial coherence of an extended 
source with randomly varying phase drops off with lateral slit separation in the 
same way that the field pattern at the focus of a converging spherical wave would 
drop off, whose field amplitude distribution is the same as the original intensity 
distribution. 



12 M. Born and E. Wolf, Principles of Optics, 7th ed., p. 574 (Cambridge University Press, 1999).

Exercises for 8.2 Coherence Time and Fringe Visibility

P8.1 (a) Verify that (8.16) gives the fringe visibility.

HINT: Write $\gamma = |\gamma| e^{i\phi}$ and assume that the oscillations in $\gamma$ that give
rise to fringes are due entirely to changes in $\phi$ and that $|\gamma|$ is a slowly
varying function in comparison to the oscillations.

(b) What is the coherence time $\tau_c$ of the light in P8.4?

P8.2 (a) Show that the fringe visibility of a Gaussian spectral distribution
(see Example 8.2) goes from 1 to $e^{-\pi/2} = 0.21$ as the round-trip path in
one arm of the instrument is extended by a coherence length.

(b) Find the FWHM bandwidth in wavelength $\Delta\lambda_\mathrm{FWHM}$ in terms of the
coherence length $\ell_c$ and the center wavelength $\lambda_0$.

HINT: First determine $\Delta\omega_\mathrm{FWHM}$, defined to be the width of $I(\omega)$ at half
of its peak. To convert to a wavelength difference, use $\omega = \frac{2\pi c}{\lambda} \Rightarrow
\Delta\omega_\mathrm{FWHM} \cong -\frac{2\pi c}{\lambda^2} \Delta\lambda_\mathrm{FWHM}$. You can ignore the minus sign; it simply
means that wavelength decreases as frequency increases.



Exercises for 8.3 Temporal Coherence of Continuous Sources 

P8.3 Show that $\mathrm{Re}\{\gamma(\tau)\}$ defined in (8.10) reduces to $\cos(\omega_0 \tau)$ in the case of
a plane wave $E(t) = E_0 e^{i(k_0 z - \omega_0 t)}$ being sent through a Michelson
interferometer. In other words, the output intensity from the interferometer
reduces to

$$I = 2I_0 \left[ 1 + \cos(\omega_0 \tau) \right]$$

as you already expect. 

HINT: Don't be afraid of delta functions. After integration, the left-over 
delta functions cancel. 

P8.4 Light emerging from a dense hot gas has a collisionally broadened
power spectrum described by the Lorentzian function

$$I(\omega) = \frac{I(\omega_0)}{1 + \left( \frac{\omega - \omega_0}{\Delta\omega_\mathrm{FWHM}/2} \right)^2}$$

The light is sent into a Michelson interferometer. Make a graph of the
average power arriving to the detector as a function of $\tau$.

HINT: See (0.56).

P8.5 Consider the light source described in P8.4.

(a) Regardless of how the phase of $E(\omega)$ is organized, the oscillation
of the energy arriving to the detector as a function of $\tau$ is the same.
The spectral phase of the light in P8.4 is randomly organized. Describe
qualitatively how the light probably looks as a function of time.

(b) Now suppose that the phase of the light is somehow neatly organized
such that

$$E(\omega) = \frac{i E(\omega_0)}{i + \frac{\omega - \omega_0}{\Delta\omega_\mathrm{FWHM}/2}}$$

Perform the inverse Fourier transform on the field and find how the
intensity of the light looks as a function of time.

HINT:

$$\int_{-\infty}^{\infty} \frac{e^{-iax}}{x + \beta}\, dx = \begin{cases} -2i\pi\, e^{ia\beta} & \text{if } a > 0 \\ 0 & \text{if } a < 0 \end{cases} \qquad (\mathrm{Im}\, \beta > 0)$$

The constants $I(\omega_0)$ and $\Delta\omega_\mathrm{FWHM}$ will appear in the answer.

Figure 8.8 [Diagram labels: Detector, Beam Splitter]



L8.6 (a) Use a scanning Michelson interferometer to measure the wave- 
length of the ultrashort laser pulses from a mode-locked Ti:sapphire 
oscillator. 13 

(b) Measure the coherence length of the source by observing the distance
over which the visibility diminishes. From your measurement,
what is the bandwidth $\Delta\lambda_\mathrm{FWHM}$ of the source, assuming the Gaussian
profile in the previous problem? See P8.2.

(c) Use a computer to perform a fast Fourier transform (FFT) of the
signal output. For the positive frequencies, plot the laser spectrum as a
function of $\lambda$ and compare with the results of (a) and (b).

(d) How do the results change if the ultrashort pulses are first stretched 
in time by traversing a thick piece of glass? 



Exercises for 8.5 Young's Two-Slit Setup and Spatial Coherence 

P8.7 (a) A point source with wavelength $\lambda = 500$ nm illuminates two parallel
slits separated by $h = 1.0$ mm. If the screen is $D = 2$ m away, what is
the separation between the diffraction peaks on the screen? Make a
sketch.

13 J. Peatross and S. Bergeson, "Fourier Spectroscopy of Ultrashort Laser Pulses," Am. J. Phys. 74,
842-845 (2006).

(b) A thin piece of glass with thickness $d = 0.01$ mm and index $n = 1.5$ is
placed in front of one of the slits. By how many fringes does the pattern
at the screen move?

HINT: Add $\Delta\phi$ to $k(d_2 - d_1)$ in (8.25), where $\Delta\phi = \phi_2 - \phi_1$ is the relative
phase between the two paths. Compare the phase of the light when
traversing the glass versus traversing an empty region of the same
thickness.

L8.8 (a) Carefully measure the separation of a double slit in the lab ($h \approx 0.1$ mm
separation) by shining a HeNe laser ($\lambda = 633$ nm) through it
and measuring the diffraction peak separations on a distant wall (say,
2 m from the slits).

HINT: For better accuracy, measure across several fringes and divide. 

Figure 8.9 [Setup diagram labels: Filter, Rotating diffuser to create phase variation,
CCD Camera]

(b) Create an extended light source with a HeNe laser using a time- 
varying diffuser followed by an adjustable single slit. (The diffuser 
must rotate rapidly to create random time variation of the phase at 
each point as would occur automatically for a natural source such 
as a star.) Place the double slit at a distance of $R \approx 100$ cm after the
first slit. (Take note of the exact value of $R$, as you will need it for the
next problem.) Use a lens to image the diffraction pattern that would
have appeared on a far-away screen into a video camera. Observe
the visibility of the fringes. Adjust the width of the source with the
single slit until the visibility of the fringes disappears. After making the
source wide enough to cause the fringe pattern to degrade, measure
the single slit width $a$ by shining a HeNe laser through it and observing
the diffraction pattern on the distant wall. (video)

HINT: As we will study later, a single slit of width $a$ produces an intensity
pattern on a screen a distance $L$ away described by

$$I(x) = I_\mathrm{peak}\, \mathrm{sinc}^2\left( \frac{\pi a}{\lambda L}\, x \right)$$

where $\mathrm{sinc}(\alpha) \equiv \frac{\sin\alpha}{\alpha}$ and $\lim_{\alpha \to 0} \frac{\sin\alpha}{\alpha} = 1$.



[Figure 8.9 diagram labels, continued: Diffuser, Single slit, Double slit of separation $h$,
$R \approx 100$ cm]

NOTE: It would have been nicer to vary the separation of the two slits
to determine the width of a fixed source. However, because it is hard to
make an adjustable double slit, we varied the size of the source until
the spatial coherence of the light matched the slit separation.

P8.9 (a) Compute h c for a uniform intensity distribution of width a using 
(8.39). 

(b) Use this formula to check that your measurements in L8.8 agree 
with spatial coherence theory. 

HINT: In your experiment h c is the double slit separation. Use your 
measured R and h to calculate what the width of the single slit (i.e. 
a) should have been when the fringes disappeared and compare this 
calculation to your direct measurement of a. 



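Solution: (This is only a partial solution)

$$
\begin{aligned}
\gamma(h) &= \frac{\int_{-a/2}^{a/2} I_0 \exp\left[ -ikh\left( \frac{y}{D} + \frac{y'}{R} \right) \right] dy'}{\int_{-a/2}^{a/2} I_0\, dy'} = \frac{e^{-ikh\frac{y}{D}}}{a} \int_{-a/2}^{a/2} e^{-ikh\frac{y'}{R}}\, dy' \\
&= \frac{e^{-ikh\frac{y}{D}}}{a} \left. \frac{e^{-ikh\frac{y'}{R}}}{-ikh/R} \right|_{-a/2}^{a/2} = e^{-ikh\frac{y}{D}}\, \frac{e^{-ikh\frac{a}{2R}} - e^{ikh\frac{a}{2R}}}{-2i\, \frac{kha}{2R}} = e^{-ikh\frac{y}{D}}\, \mathrm{sinc}\left( \frac{kha}{2R} \right)
\end{aligned}
$$

Note that

$$\int_0^{\infty} \frac{\sin^2 \alpha x}{(\alpha x)^2}\, dx = \frac{\pi}{2\alpha}$$

True and False Questions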

R29 T or F: In our notation (widely used), $I(t)$ is the Fourier transform of
$I(\omega)$.

R30 T or F: The integral of $I(t)$ over all $t$ equals the integral of $I(\omega)$ over all
$\omega$.

R31 T or F: The phase velocity of light (the speed of an individual frequency 
component of the field) never exceeds the speed of light c. 

R32 T or F: The group velocity of light in a homogeneous material can 
exceed c if absorption or amplification takes place. 

R33 T or F: The group velocity of light never exceeds the phase velocity. 

R34 T or F: A Michelson interferometer can be used to measure the spectral
intensity of light $I(\omega)$.

R35 T or F: A Michelson interferometer can be used to measure the duration 
of a short laser pulse and thereby characterize its chirp. 

R36 T or F: A Michelson interferometer can be used to measure the wave- 
length of light. 

R37 T or F: A Michelson interferometer can be used to measure the phase
of $E(\omega)$.

R38 T or F: The Fourier transform (or inverse Fourier transform if you prefer)
of $I(\omega)$ is proportional to the degree of temporal coherence.

R39 T or F: A Michelson interferometer is ideal for measuring the spatial 
coherence of light. 

R40 T or F: The Young's two-slit setup is ideal for measuring the temporal
coherence of light.

R41 T or F: Vertically polarized light illuminates a Young's double-slit setup
and fringes are seen on a distant screen with good visibility. A half wave
plate is placed in front of one of the slits so that the polarization for that
slit becomes horizontally polarized. Here's the statement: The fringes
at the screen will shift position but maintain their good visibility.



[Diagram labels: Horizontal Polarizer, Vertical Polarizer]

Figure 8.10

Figure 8.11 Polarizing Elements



Problems

R42 (a) Horizontally polarized light enters a system and first travels through
a horizontal and then a vertical polarizer in series. What is the Jones
vector of the transmitted field?

(b) Now a polarizer at 45° is inserted between the two polarizers in the
system described in (a). What is the Jones vector of the transmitted
field? How does the final intensity compare to initial intensity?

(c) Now a quarter wave plate with a fast-axis angle at 45° is inserted
between the two polarizers (instead of the polarizer of part (b)). What
is the Jones vector of the transmitted field? How does the final intensity
compare to initial intensity?

R43 (a) Find the Jones matrix for a half wave plate with its fast axis making an
arbitrary angle $\theta$ with the x-axis.

HINT: Project an arbitrary polarization with $E_x$ and $E_y$ onto the fast
and slow axes of the wave plate. Shift the slow axis phase by $\pi$, and then
project the field components back onto the horizontal and vertical axes.
The answer is

$$\begin{pmatrix} \cos^2\theta - \sin^2\theta & 2\sin\theta\cos\theta \\ 2\sin\theta\cos\theta & \sin^2\theta - \cos^2\theta \end{pmatrix}$$

(b) We desire to create a variable attenuator for a polarized laser beam
using a half wave plate and a polarizer aligned to the initial polarization
of the beam (see figure). The fast axis of the half wave plate is initially
aligned in the direction of polarization and then rotated through an
angle $\theta$. What is the ratio of the intensity exiting the polarizer to the
incoming intensity as a function of $\theta$?

R44 (a) What is the spectral content (i.e., $I(\omega)$) of a square laser pulse

$$E(t) = \begin{cases} E_0 e^{-i\omega_0 t}, & |t| \le \tau/2 \\ 0, & |t| > \tau/2 \end{cases}$$

Make a sketch of $I(\omega)$, indicating the location of the first zeros.

(b) What is the temporal shape (i.e., $I(t)$) of a light pulse with frequency
content

$$E(\omega) = \begin{cases} E_0, & |\omega - \omega_0| \le \Delta\omega/2 \\ 0, & |\omega - \omega_0| > \Delta\omega/2 \end{cases}$$

where in this case $E_0$ has units of E-field per frequency. Make a sketch
of $I(t)$, indicating the location of the first zeros.

(c) If $E(\omega)$ is known (any arbitrary function, not the same as above), and
the light goes through a material of thickness $\ell$ and index of refraction
$n(\omega)$, how would you find the form of the pulse $E(t)$ after passing
through the material? Please set up the integral.

R45 (a) Prove Parseval's theorem:

$$\int_{-\infty}^{\infty} |E(\omega)|^2\, d\omega = \int_{-\infty}^{\infty} |E(t)|^2\, dt.$$

HINT:

(b) Explain the physical relevance of Parseval's theorem to light pulses.
Suppose that you have a detector that measures the total energy in
a pulse of light, say 1 mJ directed onto an area of 1 mm². Next you
measure the spectrum of light and find it to have a width of $\Delta\lambda =
50$ nm, centered at $\lambda_0 = 800$ nm. Assume that the light has a Gaussian
frequency profile

$$I(\omega) = I(\omega_0)\, e^{-\left( \frac{\omega - \omega_0}{\delta\omega} \right)^2}$$

Use as an approximate value $\delta\omega \cong \frac{2\pi c}{\lambda^2} \Delta\lambda$. Find a value and correct
units for $I(\omega_0)$.

HINT:

$$\int_{-\infty}^{\infty} e^{-Ax^2 + Bx + C}\, dx = \sqrt{\frac{\pi}{A}}\, e^{B^2/4A + C} \qquad \mathrm{Re}\{A\} > 0$$

R46 Continuous light entering a Michelson interferometer has a spectrum 
described by 

$$I(\omega) = \begin{cases} I_0, & |\omega - \omega_0| \le \Delta\omega/2 \\ 0, & |\omega - \omega_0| > \Delta\omega/2 \end{cases}$$



The Michelson interferometer uses a 50:50 beam splitter. The emerging
light has intensity $\langle I_{\rm det}(t,\tau)\rangle_t = 2\langle I(t)\rangle_t\,[1 + \mathrm{Re}\,\gamma(\tau)]$, where the degree of
coherence is

$$\gamma(\tau) = \int_{-\infty}^{\infty} I(\omega)\, e^{-i\omega\tau}\, d\omega \Big/ \int_{-\infty}^{\infty} I(\omega)\, d\omega$$

Find the fringe visibility $V \equiv (I_{\max} - I_{\min})/(I_{\max} + I_{\min})$ as a function of $\tau$
(i.e. the round-trip delay due to moving one of the mirrors).
R47 Light emerging from a point travels by means of two very narrow slits
to a point $y$ on a screen. The intensity at the screen arising from a point
source at position $y'$ is found to be

$$I_{\rm screen}(y', h) = 2I(y')\left\{1 + \cos\left[kh\left(\frac{y}{D} + \frac{y'}{R}\right)\right]\right\}$$

where an approximation has restricted us to small angles.

(a) Now, suppose that $I(y')$ characterizes emission from a wider source
with randomly varying phase across its width. Write down an expression
(in integral form) for the resulting intensity at the screen:



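(b) Assume that the source has an emission distribution with the form
$I(y') = (I_0/\Delta y')\, e^{-y'^2/\Delta y'^2}$. What is the function $\gamma(h)$ where the intensity
is written $I_{\rm screen}(h) = 2\sqrt{\pi}\, I_0\,[1 + \mathrm{Re}\,\gamma(h)]$?
HINT:

$$\int_{-\infty}^{\infty} e^{-Ax^2 + Bx + C}\, dx = \sqrt{\frac{\pi}{A}}\, e^{B^2/4A + C} \qquad \mathrm{Re}\{A\} > 0.$$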



(c) As $h$ varies, the intensity at a point on the screen $y$ oscillates. As $h$
grows wider, the amplitude of oscillations decreases. How wide must
the slit separation $h$ become (in terms of $R$, $k$, and $\Delta y'$) to reduce the
visibility to

$$V \equiv \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}} = \frac{1}{3}$$

Selected Answers 

R42: (b) 1/4, (c) 1/2. 

R45: (b) $3.8 \times 10^{-16}$ J/(cm$^2 \cdot$ s$^{-1}$).



Chapter 9 

Light as Rays 



So far in our study of optics, we have described light in terms of waves, which sat- 
isfy Maxwell's equations. However, as you are probably aware, in many situations 
light can be thought of as rays pointing along the direction of wave propagation. A 
ray picture is useful when one is interested in the macroscopic flow of light energy, 
but rays fail to reveal fine details, in particular wave and diffraction phenomena. 
For example, simple ray theory suggests that a lens can focus light down to a point. 
However, if a beam of light were concentrated onto a true point, the intensity 
would be infinite! Nevertheless, ray theory is useful for predicting where a focus 
occurs. It is also useful for describing imaging properties of optical systems (e.g. 
lenses and mirrors) . 

Beginning in section 9.3 we study the details of ray theory and the imaging 
properties of optical systems. First, however, we examine the justification for ray 
theory starting from Maxwell's equations. In the short-wavelength limit, Maxwell's 
equations give rise to the eikonal equation, which governs the direction of rays 
in a medium with an index of refraction that varies with position. The German
word 'eikonal' comes from the Greek 'εἰκών' from which the modern word 'icon'
derives. The eikonal equation therefore has a descriptive title since it controls the
formation of images. Although we will not use the eikonal equation extensively, 
we will show how it embodies the underlying justification for ray theory. As will be 
apparent in its derivation, the eikonal equation relies on an approximation that 
the features of interest in the light distribution are large relative to the wavelength 
of the light. 

The eikonal equation describes the direction of ray propagation, even in com- 
plicated situations such as desert mirages where air is heated near the ground and 
has a different index than the air farther from the ground. Rays of light from the 
sky that initially are directed toward the ground can be bent such that they travel 
parallel to or even up from the ground, owing to the inhomogeneous refractive 
index. The eikonal equation can also be used to deduce Fermat's principle, which 
in short says that light travels from point A to point B following a path that takes 
the minimum time. This principle can be used, for example, to 'derive' Snell's law.
Of course Fermat asserted his principle more than a century before Maxwell's
equations were known, but it is nice to give justification retroactively to Fermat's
principle using the modern perspective.

In this chapter, we will analyze the propagation of rays through optical systems 
composed of lenses and/or curved mirrors in the context of paraxial ray theory. 
The paraxial approximation restricts rays to travel nearly parallel to the axis of 
such systems. We consider the effects of three basic optical elements acting on 
paraxial rays: 1) Unobstructed propagation through a distance d in a uniform 
medium; a ray may move farther away from (or closer to) the optical axis, as it 
travels. 2) Reflection from a curved spherical mirror, which changes a ray's angle 
with respect to the optical axis. 3) Transmission through a spherical interface 
between two materials with differing refractive indices. The effects of each of 
these basic elements on a ray of light can be represented as a 2 x 2 matrix, which 
can be multiplied together to construct more complex imaging systems (such as 
a lens or a series of lenses and curved mirrors). 

We will study image formation in the context of the paraxial approximation, 
which in the case of a curved mirror or a thin lens gives rise to the familiar formula 

$$\frac{1}{f} = \frac{1}{d_o} + \frac{1}{d_i} \tag{9.1}$$

Even a complicated multi-element optical system obeys (9.1) if $d_o$ and $d_i$ are
measured from principal planes rather than the single plane of, for example, a
thin lens.

Paraxial ray theory can also be used to study the stability of laser cavities. The 
formalism predicts whether a ray after many round trips in the cavity remains 
near the optical axis (trapped and therefore stable) or if it drifts endlessly away 
from the axis of the cavity on successive round trips. 

In appendix 9.A we address deviations from the paraxial ray theory known 
as aberrations. We also comment on ray-tracing techniques, used for designing 
optical systems that minimize such aberrations. 



9.1 The Eikonal Equation

In this chapter, we consider light to consist of only a single frequency $\omega$. The wave
equation (2.13) for a medium with a real index of refraction in this case may be
written as

$$\nabla^2 \mathbf{E}(\mathbf{r}, t) + \frac{n^2(\mathbf{r})\,\omega^2}{c^2}\, \mathbf{E}(\mathbf{r}, t) = 0 \tag{9.2}$$

where we have already performed the time differentiation on the assumed time
dependence $e^{-i\omega t}$. Although in chapter 2 we considered solutions to the wave
equation in a homogeneous material, the wave equation remains valid when the
index of refraction varies throughout space (i.e. if $n(\mathbf{r})$ is an arbitrary function
of $\mathbf{r}$). In this case, the usual plane-wave solutions no longer satisfy the wave
equation.



As a trial solution for (9.2), we take

$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0(\mathbf{r})\, e^{i[k_{\rm vac} R(\mathbf{r}) - \omega t]} \tag{9.3}$$

where

$$k_{\rm vac} \equiv \frac{\omega}{c} = \frac{2\pi}{\lambda_{\rm vac}} \tag{9.4}$$



Here $R(\mathbf{r})$ is a real scalar function (which depends on position) having the dimension
of length. By taking $R(\mathbf{r})$ to be real, we do not take into account possible
absorption or amplification in the medium. Even though the trial solution (9.3)
looks somewhat like a plane wave, 1 the function $R(\mathbf{r})$ accommodates wave fronts
that can be curved or distorted as depicted in Fig. 9.1. At any given instant $t$, the
curved surfaces of constant phase described by $R(\mathbf{r})$ = constant can be interpreted
as wave fronts of the solution. The wave fronts travel in the direction for which
$R(\mathbf{r})$ varies the fastest. This direction is aligned with $\nabla R(\mathbf{r})$, which lies in the
direction perpendicular to surfaces of constant phase.

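The substitution of the trial solution (9.3) into the wave equation (9.2) gives

$$\frac{1}{k_{\rm vac}^2} \nabla^2\left[\mathbf{E}_0(\mathbf{r})\, e^{i k_{\rm vac} R(\mathbf{r})}\right] + n^2(\mathbf{r})\, \mathbf{E}_0(\mathbf{r})\, e^{i k_{\rm vac} R(\mathbf{r})} = 0 \tag{9.5}$$

where we have divided each term by $e^{-i\omega t}$.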

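Computing the Laplacian in (9.5)

The gradient of the x component of the field is

$$\nabla\left[E_{0x}(\mathbf{r})\, e^{i k_{\rm vac} R(\mathbf{r})}\right] = \left[\nabla E_{0x}(\mathbf{r})\right] e^{i k_{\rm vac} R(\mathbf{r})} + i k_{\rm vac} E_{0x}(\mathbf{r}) \left[\nabla R(\mathbf{r})\right] e^{i k_{\rm vac} R(\mathbf{r})}$$

The Laplacian of the x component is

$$\nabla \cdot \nabla\left[E_{0x}(\mathbf{r})\, e^{i k_{\rm vac} R(\mathbf{r})}\right] = \Big\{\nabla^2 E_{0x}(\mathbf{r}) - k_{\rm vac}^2 E_{0x}(\mathbf{r}) \left[\nabla R(\mathbf{r})\right] \cdot \left[\nabla R(\mathbf{r})\right] + i k_{\rm vac} E_{0x}(\mathbf{r}) \left[\nabla^2 R(\mathbf{r})\right] + 2 i k_{\rm vac} \left[\nabla E_{0x}(\mathbf{r})\right] \cdot \left[\nabla R(\mathbf{r})\right]\Big\}\, e^{i k_{\rm vac} R(\mathbf{r})}$$

Upon combining the result for each vector component of $\mathbf{E}_0(\mathbf{r})$, the required spatial
derivative can be written as

$$\nabla^2\left[\mathbf{E}_0(\mathbf{r})\, e^{i k_{\rm vac} R(\mathbf{r})}\right] = \Big(\nabla^2 \mathbf{E}_0(\mathbf{r}) - k_{\rm vac}^2 \mathbf{E}_0(\mathbf{r}) \left[\nabla R(\mathbf{r})\right] \cdot \left[\nabla R(\mathbf{r})\right] + i k_{\rm vac} \mathbf{E}_0(\mathbf{r}) \left[\nabla^2 R(\mathbf{r})\right] + 2 i k_{\rm vac} \big\{\hat{\mathbf{x}} \left[\nabla E_{0x}(\mathbf{r})\right] \cdot \left[\nabla R(\mathbf{r})\right] + \hat{\mathbf{y}} \left[\nabla E_{0y}(\mathbf{r})\right] \cdot \left[\nabla R(\mathbf{r})\right] + \hat{\mathbf{z}} \left[\nabla E_{0z}(\mathbf{r})\right] \cdot \left[\nabla R(\mathbf{r})\right]\big\}\Big)\, e^{i k_{\rm vac} R(\mathbf{r})}$$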


1 If the index is spatially independent (i.e. $n(\mathbf{r}) \to n$), then (9.3) reduces to the usual plane-wave
solution of the wave equation. In this case, we have $R(\mathbf{r}) = \mathbf{k} \cdot \mathbf{r}/k_{\rm vac}$ and the field amplitude
becomes constant (i.e. $\mathbf{E}_0(\mathbf{r}) \to \mathbf{E}_0$).




Figure 9.1 Wave fronts (i.e. surfaces of constant phase given by $R(\mathbf{r})$) distributed
throughout space in the presence of a spatially inhomogeneous refractive index. The
gradient of $R$ gives the direction of travel for a wavefront.
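After performing the Laplacian and after some rearranging, (9.5) becomes

$$\left[\nabla R(\mathbf{r}) \cdot \nabla R(\mathbf{r}) - n^2(\mathbf{r})\right] \mathbf{E}_0(\mathbf{r}) = \frac{\nabla^2 \mathbf{E}_0(\mathbf{r})}{k_{\rm vac}^2} + \frac{i}{k_{\rm vac}} \mathbf{E}_0(\mathbf{r})\, \nabla^2 R(\mathbf{r}) + \frac{2i}{k_{\rm vac}} \hat{\mathbf{x}}\, \nabla E_{0x}(\mathbf{r}) \cdot \nabla R(\mathbf{r}) + \frac{2i}{k_{\rm vac}} \left[\hat{\mathbf{y}}\, \nabla E_{0y}(\mathbf{r}) \cdot \nabla R(\mathbf{r}) + \hat{\mathbf{z}}\, \nabla E_{0z}(\mathbf{r}) \cdot \nabla R(\mathbf{r})\right] \tag{9.6}$$

Figure 9.2 Depiction of possible light ray paths in a region with varying index.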

Don't be afraid; at this point we are ready to make an important approximation.
We take the limit of a very short wavelength (i.e. $1/k_{\rm vac} = \lambda_{\rm vac}/2\pi \to 0$), and
the entire right-hand side of (9.6) vanishes. (Thank goodness!) With it we lose
the effects of diffraction. We also lose surface reflections at abrupt index changes
unless specifically considered. This approximation works best in situations where
only macroscopic features are of concern.

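Our wave equation has been simplified to

$$\left[\nabla R(\mathbf{r})\right] \cdot \left[\nabla R(\mathbf{r})\right] = n^2(\mathbf{r}) \tag{9.7}$$

Written another way, this equation is

$$\nabla R(\mathbf{r}) = n(\mathbf{r})\, \hat{\mathbf{s}}(\mathbf{r}) \tag{9.8}$$

where $\hat{\mathbf{s}}$ is a unit vector pointing in the direction $\nabla R(\mathbf{r})$, the direction normal to
wave front surfaces. Equation (9.8) is called the eikonal equation. 2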

Example 9.1 

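Suppose that a region of air above the desert on a hot day has an index of refraction
that varies with height $y$ according to $n(y) = n_0\sqrt{1 + y^2/h^2}$. Verify that $R(x, y) =
n_0\left(x + y^2/2h\right)$ is a solution to the eikonal equation. (See problem P9.1 for a more
general solution.)

Solution: The gradient of our trial solution gives

$$\nabla R(x, y) = n_0\left(\hat{\mathbf{x}} + \hat{\mathbf{y}}\, y/h\right)$$

Substituting this into (9.7) gives

$$\nabla R \cdot \nabla R = n_0\left(\hat{\mathbf{x}} + \hat{\mathbf{y}}\, y/h\right) \cdot n_0\left(\hat{\mathbf{x}} + \hat{\mathbf{y}}\, y/h\right) = n_0^2\left(1 + y^2/h^2\right) = n^2(y)$$

which confirms that it is a solution. The direction of light propagation is

$$\hat{\mathbf{s}}(y) = \frac{\nabla R}{|\nabla R|} = \frac{n_0\left(\hat{\mathbf{x}} + \hat{\mathbf{y}}\, y/h\right)}{n_0\sqrt{1 + y^2/h^2}} = \frac{\hat{\mathbf{x}} + \hat{\mathbf{y}}\, y/h}{\sqrt{1 + y^2/h^2}}$$

Computed at various heights, the direction for rays turns out to be

$$\hat{\mathbf{s}}(h) = \frac{\hat{\mathbf{x}} + \hat{\mathbf{y}}}{\sqrt{2}} \qquad \hat{\mathbf{s}}(h/2) = \frac{\hat{\mathbf{x}} + \hat{\mathbf{y}}/2}{\sqrt{5/4}} \qquad \hat{\mathbf{s}}(h/4) = \frac{\hat{\mathbf{x}} + \hat{\mathbf{y}}/4}{\sqrt{17/16}}$$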



2 M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 3.1.1 (Cambridge University Press, 1999).






These are represented in Fig. 9.2. In a desert mirage, light from the sky can appear
to come from a lower position. We can determine a path for the rays by setting
$dy/dx$ equal to the slope of $\hat{\mathbf{s}}$:

$$\frac{dy}{dx} = \frac{y}{h} \quad \Rightarrow \quad y = y_0\, e^{(x - x_0)/h}$$
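The result of Example 9.1 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample values `N0 = 1.0003` and `H = 1.0` are assumptions chosen only for illustration. It compares a finite-difference gradient of $R$ against $n(y)$, as (9.7) requires, and integrates $dy/dx = y/h$ with simple Euler steps to compare against the exponential ray path.

```python
import math

N0 = 1.0003  # assumed illustrative index at y = 0
H = 1.0      # assumed illustrative height scale h

def n(y):
    """Index profile of Example 9.1: n0 * sqrt(1 + y^2/h^2)."""
    return N0 * math.sqrt(1.0 + y**2 / H**2)

def R(x, y):
    """Trial eikonal function of Example 9.1: n0 * (x + y^2 / 2h)."""
    return N0 * (x + y**2 / (2.0 * H))

def grad_R(x, y, eps=1e-6):
    """Central-difference estimate of the gradient of R."""
    dRdx = (R(x + eps, y) - R(x - eps, y)) / (2.0 * eps)
    dRdy = (R(x, y + eps) - R(x, y - eps)) / (2.0 * eps)
    return dRdx, dRdy

def eikonal_residual(x, y):
    """|grad R|^2 - n^2, which should vanish if R satisfies (9.7)."""
    gx, gy = grad_R(x, y)
    return gx**2 + gy**2 - n(y)**2

def trace_ray(x0, y0, x_end, steps=20000):
    """Euler integration of dy/dx = y/h, the slope of the ray direction."""
    dx = (x_end - x0) / steps
    y = y0
    for _ in range(steps):
        y += dx * y / H
    return y

if __name__ == "__main__":
    print("eikonal residual:", eikonal_residual(0.3, 0.5))
    print("traced height:   ", trace_ray(0.0, 0.25, 1.0))
    print("analytic height: ", 0.25 * math.exp(1.0 / H))
```

The residual should be zero to within finite-difference roundoff, and the traced height should approach the analytic exponential as the number of steps grows.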



Under the assumption of an infinitely short wavelength, the Poynting vector
is directed along $\hat{\mathbf{s}}$ as demonstrated in P9.2. In other words, the direction of
$\hat{\mathbf{s}}$ specifies the direction of energy flow. The unit vector $\hat{\mathbf{s}}$ at each location in
space points perpendicular to the wave fronts and indicates the direction that the
waves travel as seen in Fig. 9.1. We refer to a collection of vectors $\hat{\mathbf{s}}$ distributed
throughout space as rays.

In retrospect, we might have jumped straight to (9.8) without going through
the above derivation. After all, we know that each part of a wave front advances
in the direction of its gradient $\nabla R(\mathbf{r})$ (i.e. in the direction that $R(\mathbf{r})$ varies most
rapidly). We also know that each part of a wave front defined by $R(\mathbf{r})$ = constant
travels at speed $c/n(\mathbf{r})$. The slower a given part of the wave front advances, the
more rapidly $R(\mathbf{r})$ changes with position $\mathbf{r}$ and the closer the contours of constant
phase. It follows that $\nabla R(\mathbf{r})$ must be proportional to $n(\mathbf{r})$ since $\nabla R(\mathbf{r})$ denotes the
rate of change in $R(\mathbf{r})$.

9.2 Fermat's Principle 

As we have seen, the eikonal equation (9.8) governs the path that rays follow as 
they traverse a region of space, where the index varies with position. Another way 
of deducing the correct path of rays is via Fermat's principle. 3 Fermat's principle 
says that if a ray happens to travel through both points A and B, it will follow a 
path between them that takes the least time. 



Derivation of Fermat's Principle from the Eikonal Equation 

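We begin by taking the curl of (9.8) to obtain 4

$$\nabla \times \left[n(\mathbf{r})\, \hat{\mathbf{s}}(\mathbf{r})\right] = \nabla \times \left[\nabla R(\mathbf{r})\right] = 0 \tag{9.9}$$

This can be integrated over an open surface of area $A$ to give

$$\int_A \nabla \times \left[n(\mathbf{r})\, \hat{\mathbf{s}}(\mathbf{r})\right] \cdot d\mathbf{a} = \oint_C n(\mathbf{r})\, \hat{\mathbf{s}}(\mathbf{r}) \cdot d\boldsymbol{\ell} = 0 \tag{9.10}$$

where we have applied Stokes' theorem (0.12) to convert the area integral into a
path integral around the perimeter contour $C$.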




Pierre de Fermat (1601-1665, French) 
was born in Beaumont-de-Lomagne, 
France to a wealthy merchant family. 
He attended the University of Toulouse 
before moving to Bordeaux in the late 
1620s where Fermat distinguished him- 
self as a mathematician. Fermat was 
proficient in many languages and went 
on to obtain a law degree in 1631 from 
the University of Orleans. He continued 
his study of mathematics as a hobby 
throughout his life. He corresponded 
with a number of notable mathemati- 
cians, and through his letters made 
notable contributions to analytic ge- 
ometry, probability theory, and number 
theory. He was often quite secretive 
about the methods used to obtain his 
results. Mathematicians suspect that 
Fermat didn't actually prove his famous 
last theorem, which was not able to be 
verified until the 1990's. Fermat was 
the first to assert that the path taken 
by a beam of light is the one that can 
be traveled in the least amount of time. 
(Wikipedia) 



3 M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 3.3.2 (Cambridge University Press, 1999). 
4 The curl of a gradient is identically zero for any function. 
Figure 9.3 A ray of light leaving point A arriving at B.



Equation (9.10) states that the integration of $n\hat{\mathbf{s}} \cdot d\boldsymbol{\ell}$ around a closed loop is always
zero. If we consider a closed loop comprised of a path from point A to point B and
then a different path from point B back to point A again, the integrals for the two
legs always cancel, even while holding one path fixed while varying the other. This
means

$$\int_A^B n\hat{\mathbf{s}} \cdot d\boldsymbol{\ell} \quad \text{is independent of path from A to B.} \tag{9.11}$$



Now consider a path from A to B that is parallel to $\hat{\mathbf{s}}$, as depicted in Fig. 9.3. In
this case, the cosine in the dot product is always one. If we choose some other
path that connects A and B, the cosine associated with the dot product is less than
one at most points along that path, whereas the result of the integral is the same.
Therefore, if we artificially remove the dot product from the integral (i.e. exclude
the cosine factor), the result of the integral will exceed the true value unless the
path chosen follows the direction of $\hat{\mathbf{s}}$ (i.e. the path that corresponds to the one
that light rays actually follow).

In mathematical form, this argument can be expressed as

$$\int_A^B n\hat{\mathbf{s}} \cdot d\boldsymbol{\ell} = \min\left\{\int_A^B n\, d\ell\right\} \tag{9.12}$$

The integral on the right is called the optical path length (OPL) between points A
and B:

$$OPL\big|_A^B \equiv \int_A^B n\, d\ell \tag{9.13}$$

The conclusion is that the true path that light follows between two points (i.e.
the one that stays parallel to $\hat{\mathbf{s}}$) is the one with the shortest optical path length.
The index $n$ may vary with position and therefore can be different for each of the
incremental distances $d\ell$.



Fermat's principle is usually stated in terms of the time it takes light to travel
between points. The travel time $\Delta t$ depends not only on the path taken by the
light but also on the velocity of the light $v(\mathbf{r})$, which varies spatially with the
refractive index:

$$\Delta t\big|_A^B = \int_A^B \frac{d\ell}{v(\mathbf{r})} = \int_A^B \frac{d\ell}{c/n(\mathbf{r})} = \frac{OPL\big|_A^B}{c} \tag{9.14}$$



To find the correct path for the light ray that leaves point A and crosses point 
B, we need only minimize the optical path length between the two points. Mini- 
mizing the optical path length is equivalent to minimizing the time of travel since 
it differs from the time of travel only by the constant c. The optical path length 
is not the actual distance that the light travels; it is proportional to the number 
of wavelengths that fit into that distance (see (2.24)). Thus, as the wavelength 
shortens due to a higher index of refraction, the optical path length increases. 
The correct ray traveling from A to B does not necessarily follow a straight line 
but can follow a complicated curve according to how the index varies. 



An imaging situation occurs when many paths from point A to point B have
the same optical path length. An example of this occurs when a lens causes an 
image to form. In this case all rays leaving point A (on an object) and traveling 
through the system to point B (on the image) experience equal optical path 
lengths. This situation is depicted in Fig. 9.4. Note that while the rays traveling 
through the center of the lens have a shorter geometric path length, they travel 
through more material so that the optical path length is the same for all rays. 

To summarize Fermat's principle, of the many rays that might emanate from 
a point A, the ray that crosses a second point B is the one that follows the shortest 
optical path length. If many rays tie for having the shortest optical path, we 
say that an image of point A forms at point B. It should be noted that Fermat's 
principle, as we have written it, does not work for anisotropic media such as 
crystals where n depends on the direction of a ray as well as on its location (see 
P9.4). 




Figure 9.4 Rays of light leaving 
point A with the same optical path 
length to B. 



Example 9.2 

Use Fermat's principle to derive Snell's law. 

Solution: Consider the many rays of light that leave point A seen in Fig. 9.5. Only 
one of the rays passes through point B. Within each medium we expect the light to 
travel in a straight line since the index is uniform. However, at the boundary we 
must allow for bending since the index changes. 

The optical path length between points A and B may be written

$$OPL = n_i\sqrt{x_i^2 + y_i^2} + n_t\sqrt{x_t^2 + y_t^2} \tag{9.15}$$



We need to minimize this optical path length to find the correct one according to 
Fermat's principle. 

Since points A and B are fixed, we may regard $x_i$ and $x_t$ as constants. The distances
$y_i$ and $y_t$ are not constants although the combination

$$y_{\rm tot} \equiv y_i + y_t \tag{9.16}$$

is constant. Thus, we may rewrite (9.15) as

$$OPL(y_i) = n_i\sqrt{x_i^2 + y_i^2} + n_t\sqrt{x_t^2 + (y_{\rm tot} - y_i)^2} \tag{9.17}$$

where everything on the right-hand side is constant except for $y_i$.

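We now minimize the optical path length by taking the derivative and setting it
equal to zero:

$$\frac{d(OPL)}{dy_i} = n_i \frac{y_i}{\sqrt{x_i^2 + y_i^2}} + n_t \frac{-(y_{\rm tot} - y_i)}{\sqrt{x_t^2 + (y_{\rm tot} - y_i)^2}} = 0 \tag{9.18}$$

Notice that

$$\sin\theta_i = \frac{y_i}{\sqrt{x_i^2 + y_i^2}} \qquad \text{and} \qquad \sin\theta_t = \frac{y_t}{\sqrt{x_t^2 + y_t^2}} \tag{9.19}$$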



Figure 9.5 Rays of light leaving point A; not all of them will traverse point B.
When these are substituted into (9.18) we obtain

$$n_i \sin\theta_i = n_t \sin\theta_t \tag{9.20}$$

which is the familiar Snell's law.
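The minimization in Example 9.2 can also be carried out numerically. The sketch below is an illustration rather than part of the original text: the function names, the golden-section search, and the sample geometry ($x_i = x_t = 1$, $y_{\rm tot} = 2$, $n_i = 1$, $n_t = 1.5$) are all assumptions. It minimizes the optical path length (9.17) over $y_i$ and then evaluates both sides of (9.20) at the minimizing point.

```python
import math

def opl(y_i, x_i, x_t, y_tot, n_i, n_t):
    """Optical path length (9.17) as a function of the crossing height y_i."""
    return n_i * math.hypot(x_i, y_i) + n_t * math.hypot(x_t, y_tot - y_i)

def minimize_opl(x_i, x_t, y_tot, n_i, n_t, tol=1e-10):
    """Golden-section search for the y_i in [0, y_tot] that minimizes the OPL.

    The OPL is a sum of two convex functions of y_i, so it has one minimum.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, y_tot
    while hi - lo > tol:
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if opl(a, x_i, x_t, y_tot, n_i, n_t) < opl(b, x_i, x_t, y_tot, n_i, n_t):
            hi = b   # minimum lies in [lo, b]
        else:
            lo = a   # minimum lies in [a, hi]
    return 0.5 * (lo + hi)

def snell_sides(x_i, x_t, y_tot, n_i, n_t):
    """Return (n_i sin(theta_i), n_t sin(theta_t)) at the minimizing y_i."""
    y_i = minimize_opl(x_i, x_t, y_tot, n_i, n_t)
    y_t = y_tot - y_i
    sin_i = y_i / math.hypot(x_i, y_i)   # first relation in (9.19)
    sin_t = y_t / math.hypot(x_t, y_t)   # second relation in (9.19)
    return n_i * sin_i, n_t * sin_t

if __name__ == "__main__":
    left, right = snell_sides(1.0, 1.0, 2.0, 1.0, 1.5)
    print("n_i sin(theta_i) =", left)
    print("n_t sin(theta_t) =", right)
```

The two printed values should agree to within the accuracy of the search, which is limited by how flat the optical path length is near its minimum.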



Figure 9.6



Example 9.3 

Use Fermat's principle to derive the equation of curvature for a reflective surface 
that causes all rays leaving one point to image to another. Do the calculation in 
two dimensions rather than in three. 5 

Solution: We adopt the convention that the origin is halfway between the points,
which are separated by a distance $2a$, as shown in Fig. 9.6. If the points are to
image to each other, Fermat's principle requires that the total path length be a
constant; call it $b$. By inspection of the figure, we see that the path (which reflects once)
from one point to the other is

$$\sqrt{(x + a)^2 + y^2} + \sqrt{(x - a)^2 + y^2} = b \tag{9.21}$$

To get (9.21) into a more recognizable form, we isolate the first square root and
square both sides of the equation, which gives

$$(x + a)^2 + y^2 = b^2 + (x - a)^2 + y^2 - 2b\sqrt{(x - a)^2 + y^2}$$

After squaring the two binomial terms, some nice cancellations occur, and we get

$$4ax - b^2 = -2b\sqrt{(x - a)^2 + y^2}$$

which we square again to obtain

$$16a^2x^2 - 8ab^2x + b^4 = 4b^2\left[x^2 - 2ax + a^2 + y^2\right]$$

After some cancellations and regrouping this becomes

$$\left(16a^2 - 4b^2\right)x^2 - 4b^2y^2 = 4a^2b^2 - b^4$$

Finally, we divide both sides by the term on the right to obtain the (hopefully)
familiar form of an ellipse

$$\frac{x^2}{\left(\frac{b}{2}\right)^2} + \frac{y^2}{\left(\frac{b^2}{4} - a^2\right)} = 1 \tag{9.22}$$
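As a quick consistency check on Example 9.3, one can sample points on the ellipse (9.22) and confirm that the reflected path length (9.21) equals $b$ at each of them. This is only a sketch; the function names and the sample values $a = 1$, $b = 3$ are assumptions for illustration (any $b > 2a$ should work).

```python
import math

def ellipse_point(a, b, t):
    """Point on the ellipse (9.22) at parameter t; requires b > 2a."""
    semi_x = b / 2.0
    semi_y = math.sqrt(b**2 / 4.0 - a**2)
    return semi_x * math.cos(t), semi_y * math.sin(t)

def reflected_path_length(a, x, y):
    """Left-hand side of (9.21): distance from (-a, 0) to (x, y) to (+a, 0)."""
    return math.hypot(x + a, y) + math.hypot(x - a, y)

if __name__ == "__main__":
    a, b = 1.0, 3.0
    for t in (0.0, 0.7, 1.6, 2.5, 4.0):
        x, y = ellipse_point(a, b, t)
        print(t, reflected_path_length(a, x, y))  # each should be close to b
```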



5 This configuration is used to direct flash lamp energy into a laser amplifier rod. One 'point' in 
Fig. 9.6 represents the end of an amplifier rod while the other represents the end of a thin flash-lamp 
tube. 
9.3 Paraxial Rays and ABCD Matrices

We now turn our attention to the effects of curved mirrors and lenses on rays of 
light. Keep in mind that when describing light as a collection of rays rather than 
as waves, the results can only describe features that are macroscopic compared to 
a wavelength. The rays of light at each location in space describe approximately 
the direction of travel of the wave fronts at that location. Since the wavelength of 
visible light is extraordinarily small compared to the macroscopic features that we 
perceive in our day-to-day world, the ray approximation is often a very good one. 
This is the reason that ray optics was developed long before light was understood 
as a wave. 

We consider ray theory within the paraxial approximation, meaning that 
we restrict our attention to rays that are near and almost parallel to an optical 
axis of a system, say the z-axis. It is within this approximation that the familiar 
imaging properties of lenses occur. An image occurs when all rays from a point 
on an object converge to a corresponding point on what is referred to as the 
image. To the extent that the paraxial approximation is violated, the clarity of 
an image can suffer, and we say that there are aberrations present. The field of
optical engineering is often concerned with the minimization of aberrations in cases
where the paraxial approximation is not strictly followed. This is done so that, for
example, a camera can take pictures of objects that occupy a fairly wide angular
field of view, where rays violate the paraxial approximation. Optical systems are
typically engineered using the science of ray tracing, which is described briefly in
section 9.A.

As we develop paraxial ray theory, we should remember that rays impinging 
on devices such as lenses or curved mirrors should strike the optical component 
at near normal incidence. To quantify this statement, the paraxial approximation 
is valid to the extent that 

$$\sin\theta \cong \theta \tag{9.23}$$

is a good approximation, and similarly

$$\tan\theta \cong \theta \tag{9.24}$$

Here, the angle $\theta$ (in radians) represents the angle that a particular ray makes
with respect to the optical axis. There is an important mathematical reason for
this approximation. The sine is a nonlinear function, but at small angles it is
approximately linear and can be represented by its argument. It is this linearity
that is crucial to the process of forming images. The linearity also greatly simplifies
the formulation since it reduces the problem to linear algebra. Conveniently, we
will be able to keep track of imaging effects with a 2 x 2 matrix formalism.

Figure 9.7 The behavior of a ray as light traverses a distance d.

Consider a ray propagating in the y-z plane where the optical axis is in the z-direction.
Let us specify a ray at position $z_1$ by two coordinates: the displacement
from the axis $y_1$ and the orientation angle $\theta_1$ (see Fig. 9.7). If the index is uniform
everywhere, the ray travels along a straight path. It is straightforward to predict the
coordinates of the same ray downstream, say at $z_2$. First, since the ray continues
in the same direction, we have

$$\theta_2 = \theta_1 \tag{9.25}$$



By referring to Fig. 9.7 we can write $y_2$ in terms of $y_1$ and $\theta_1$:

$$y_2 = y_1 + d\tan\theta_1 \tag{9.26}$$

where $d \equiv z_2 - z_1$. Equation (9.26) is nonlinear in $\theta_1$. However, in the paraxial
approximation (9.24), it becomes linear, which after all is the point of the approximation.
In this approximation the expression for $y_2$ simplifies to

$$y_2 = y_1 + \theta_1 d \tag{9.27}$$



Equations (9.25) and (9.27) describe a linear transformation which in matrix
notation can be consolidated into the form

$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & d \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} \tag{9.28}$$

ABCD matrix for propagation through a distance d

Here, the vectors in this equation specify the essential information about the ray
before and after traversing the distance $d$, and the matrix describes the effect of
traversing the distance. This type of matrix is called an ABCD matrix; 6 sometimes
physicists are not very inventive with names.



Example 9.4 

Let the distance $d$ be subdivided into two distances, $a$ and $b$, such that $d = a + b$.
Show that an application of the ABCD matrix for distance $a$ followed by an
application of the ABCD matrix for $b$ renders the same result as an application of the
ABCD matrix for distance $d$.



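Solution: Individually, the effects of propagation through $a$ and through $b$ are

$$\begin{bmatrix} y_{\rm mid} \\ \theta_{\rm mid} \end{bmatrix} = \begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & b \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_{\rm mid} \\ \theta_{\rm mid} \end{bmatrix} \tag{9.29}$$

where the subscript "mid" refers to the ray in the middle position after traversing
the distance $a$. If we combine the equations, we get

$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & b \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} \tag{9.30}$$

which is in agreement with (9.28) since the ABCD matrix for the entire displacement
is

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & b \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & a + b \\ 0 & 1 \end{bmatrix} \tag{9.31}$$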
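The composition rule of Example 9.4 is easy to try out in code. The following is a minimal sketch, assuming rays are stored as `(y, theta)` tuples and matrices as nested lists; the helper names `matmul2`, `propagate`, and `apply` are hypothetical and not from the text.

```python
def matmul2(m2, m1):
    """Product m2 * m1 of two 2x2 matrices (m1 acts on the ray first)."""
    return [
        [m2[0][0] * m1[0][0] + m2[0][1] * m1[1][0],
         m2[0][0] * m1[0][1] + m2[0][1] * m1[1][1]],
        [m2[1][0] * m1[0][0] + m2[1][1] * m1[1][0],
         m2[1][0] * m1[0][1] + m2[1][1] * m1[1][1]],
    ]

def propagate(d):
    """ABCD matrix (9.28) for travel through a distance d in a uniform medium."""
    return [[1.0, d], [0.0, 1.0]]

def apply(m, ray):
    """Apply a 2x2 ABCD matrix to a paraxial ray (y, theta)."""
    y, theta = ray
    return (m[0][0] * y + m[0][1] * theta, m[1][0] * y + m[1][1] * theta)

a, b = 0.3, 0.7                                   # assumed sample distances
two_steps = matmul2(propagate(b), propagate(a))   # a first, then b
one_step = propagate(a + b)

if __name__ == "__main__":
    ray = (0.001, 0.01)   # 1 mm off axis, 10 mrad, if lengths are in meters
    print(apply(two_steps, ray))
    print(apply(one_step, ray))
```

Note the ordering: the matrix for the element encountered first sits on the right of the product, as in (9.30).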





6 P. W. Milonni and J. H. Eberly, Lasers, Sect. 14.2 (New York: Wiley, 1988).
9.4 Reflection and Refraction at Curved Surfaces

We next consider the effect of reflection from a spherical surface as depicted in 
Fig. 9.8. We consider only the act of reflection without considering propagation 
before or after the reflection takes place. Thus, the incident and reflected rays 
in the figure are symbolic only of the direction of propagation before and after 
reflection; they do not indicate any amount of travel. We immediately write

$$y_2 = y_1 \tag{9.32}$$

since the ray has no chance to go anywhere.

We adopt the widely used convention that, upon reflection, the positive z-direction
is reoriented so that we consider the rays still to travel in the positive
z sense. An easy way to remember this is that the positive z direction is always
taken to be downstream of where the light is headed. Notice that in Fig. 9.8, the
reflected ray approaches the z-axis. In this case $\theta_2$ is a negative angle (as opposed
to $\theta_1$ which is drawn as a positive angle) and is equal to

$$\theta_2 = -\left(\theta_1 + 2\theta_i\right) \tag{9.33}$$

where $\theta_i$ is the angle of incidence with respect to the normal to the spherical
mirror surface. By the law of reflection, the incident and reflected ray both occur
at an angle $\theta_i$ referenced to the surface normal. The surface normal points towards
the center of curvature of the mirror surface, which we assume is on the z-axis a
distance $R$ away. By convention, the radius of curvature $R$ is a positive number
if the mirror surface is concave and a negative number if the mirror surface is
convex.



Elimination of $\theta_i$ from (9.33) in favor of $\theta_1$ and $y_1$

By inspection of Fig. 9.8 we can write

$$\frac{y_1}{R} = \sin\phi \cong \phi \tag{9.34}$$

where we have applied the paraxial approximation (9.23). (The angles in Fig. 9.8
are exaggerated. In fact, when $\phi$ is small enough for (9.34) to hold, we may also
neglect the small distance $\delta$.) By inspection of the geometry, we also have

$$\phi = \theta_1 + \theta_i \tag{9.35}$$

and when this is combined with (9.34), we get

$$\theta_i = \frac{y_1}{R} - \theta_1 \tag{9.36}$$

With this we are able to put (9.33) into a useful linear form:

$$\theta_2 = -\frac{2}{R}\, y_1 + \theta_1 \tag{9.37}$$




Figure 9.8 A ray depicted in the 
act of reflection from a spherical 
surface. 
Equations (9.32) and (9.37) describe a linear transformation that can be
concisely formulated as

$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -2/R & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} \tag{9.38}$$

ABCD matrix for a curved mirror

Figure 9.9 A ray depicted in the act of transmission at a curved material interface.

The ABCD matrix in this transformation describes the act of reflection from a 
concave mirror with radius of curvature R. The radius R is negative when the 
mirror is convex. 
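To see the mirror matrix (9.38) in action, the sketch below sends a ray parallel to the axis onto a concave mirror and finds where the reflected ray crosses the axis. The helper names and the sample radius $R = 2$ are assumptions for illustration only; the expected crossing distance $R/2$ follows from (9.37) with $\theta_1 = 0$.

```python
def mirror(radius):
    """ABCD matrix (9.38) for reflection; radius > 0 means concave."""
    return [[1.0, 0.0], [-2.0 / radius, 1.0]]

def apply(m, ray):
    """Apply a 2x2 ABCD matrix to a paraxial ray (y, theta)."""
    y, theta = ray
    return (m[0][0] * y + m[0][1] * theta, m[1][0] * y + m[1][1] * theta)

def axis_crossing_distance(ray):
    """Distance along z at which a ray (y, theta) reaches y = 0."""
    y, theta = ray
    return -y / theta

radius = 2.0                      # assumed sample radius of curvature
incoming = (0.01, 0.0)            # parallel to the axis, slightly displaced
reflected = apply(mirror(radius), incoming)

if __name__ == "__main__":
    print("reflected ray:", reflected)
    print("crosses axis after:", axis_crossing_distance(reflected))
```

For a convex mirror (negative radius) the crossing distance comes out negative, meaning the reflected ray only appears to come from a point behind the mirror.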

The final basic element that we shall consider is a spherical interface between
two materials with indices $n_i$ and $n_t$ (see Fig. 9.9). This has an effect similar to
that of the curved mirror, which changes the direction of a ray without altering
its distance $y_1$ from the optical axis. Please note that here the radius of curvature
is considered to be positive for a convex surface (opposite convention from that
of the mirror). In this way, if the lower index is on the left, a positive radius $R$ for
either the interface or the mirror tends to deflect rays towards the axis. Again, we
are interested only in the act of transmission without any travel before or after
the interface. As before, (9.32) applies (i.e. $y_2 = y_1$).

At the interface, the rays obey Snell's law, which in the paraxial approximation
is written

$$n_i \theta_i = n_t \theta_t \tag{9.39}$$

The angles $\theta_i$ and $\theta_t$ are referenced from the surface normal, as seen in Fig. 9.9.



Substituting $\theta_1$, $\theta_2$ and $y_1$ into Snell's Law

By inspection of Fig. 9.9, we have

$$\theta_i = \theta_1 + \phi \tag{9.40}$$

and

$$\theta_t = \theta_2 + \phi \tag{9.41}$$

where $\phi$ is the angle that the surface normal makes with the z-axis. As before (see
(9.34)), within the paraxial approximation we may write

$$\phi \cong y_1/R$$

When this is used in (9.40) and (9.41), which are substituted into (9.39), Snell's law
becomes

$$\theta_2 = \left(\frac{n_i}{n_t} - 1\right)\frac{y_1}{R} + \frac{n_i}{n_t}\,\theta_1 \tag{9.42}$$



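The compact matrix form of (9.32) and (9.42) is written

$$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \left(n_i/n_t - 1\right)/R & n_i/n_t \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} \tag{9.43}$$

ABCD matrix for a curved interface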
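The interface matrix (9.43) can be exercised the same way as the mirror. In this sketch a ray parallel to the axis enters glass from air through a convex surface, and we compute where it would cross the axis inside the glass. The helper names and the sample values ($n_i = 1$, $n_t = 1.5$, $R = 0.1$) are assumptions; setting $\theta_1 = 0$ in (9.42) gives a crossing distance of $n_t R/(n_t - n_i)$, which the code compares against.

```python
def interface(n_i, n_t, radius):
    """ABCD matrix (9.43) for a curved interface; radius > 0 means convex."""
    return [[1.0, 0.0], [(n_i / n_t - 1.0) / radius, n_i / n_t]]

def apply(m, ray):
    """Apply a 2x2 ABCD matrix to a paraxial ray (y, theta)."""
    y, theta = ray
    return (m[0][0] * y + m[0][1] * theta, m[1][0] * y + m[1][1] * theta)

n_i, n_t, radius = 1.0, 1.5, 0.1        # assumed sample values
incoming = (0.001, 0.0)                 # parallel to the optical axis
refracted = apply(interface(n_i, n_t, radius), incoming)

# Distance past the surface at which the refracted ray reaches y = 0
crossing = -refracted[0] / refracted[1]
expected = n_t * radius / (n_t - n_i)   # from (9.42) with theta_1 = 0

if __name__ == "__main__":
    print("refracted ray:", refracted)
    print("crossing distance:", crossing, "expected:", expected)
```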
9.5 ABCD Matrices for Combined Optical Elements

To summarize the previous two sections, we have developed ABCD matrices for 
three basic elements: 1) propagation through a region of uniform index (9.28), 
2) reflection from a curved mirror (9.38), and 3) transmission through a curved 
interface between regions with different indices (9.43). All other ABCD matrices 
that we will use are composites of these three. For example, one can construct the 
ABCD matrix for a lens by using two matrices like those in (9.43) to represent the 
entering and exiting surfaces of the lens. A distance matrix (9.28) can be inserted 
to account for the thickness of the lens. It is left as an exercise to derive the ABCD
matrix for a thick lens (see P9.6).



Example 9.5 

Derive the ABCD matrix for a thin lens, where the thickness between the two lens 
surfaces is ignored. (See P 9.6 for the more general case of a thick lens.) 

Solution: A thin lens is depicted in Fig. 9.10. $R_1$ is the radius of curvature for the
first surface (which is positive if convex as drawn), and $R_2$ is the radius of curvature
for the second surface (which is negative as drawn). For either surface, the radius
of curvature is considered to be positive if the surface is convex from the perspective
of rays that encounter it.

We take the index outside of the lens to be unity while that of the lens material to 
be n. We apply the ABCD matrix (9.43) in sequence, once for entering the lens and 
once for exiting: 



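$$ \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \frac{1}{R_2}(n-1) & n \end{bmatrix} \begin{bmatrix} 1 & 0 \\ \frac{1}{R_1}\left(\frac{1}{n}-1\right) & \frac{1}{n} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -(n-1)\left(\frac{1}{R_1}-\frac{1}{R_2}\right) & 1 \end{bmatrix} \qquad (9.44) $$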

The matrix for the first interface is written on the right, where it operates first on
an incoming ray vector. In this case, $n_i = 1$ and $n_t = n$. The matrix for the second
surface is written on the left so that it operates afterwards. For the second surface,
$n_i = n$ and $n_t = 1$.



Notice the close similarity between the ABCD matrix for a thin lens (9.44) and
the ABCD matrix for a curved mirror (9.38). The ABCD matrix for either the thin
lens or the mirror can be written as

$$ \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \qquad (9.45) $$

where in the case of the thin lens the focal length is given by the lens maker's
formula

$$ \frac{1}{f} = (n-1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) \qquad \text{(focal length of thin lens)} \qquad (9.46) $$



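Table 9.1 Summary of ABCD matrices for common optical elements.

Distance within a material, excluding interfaces:
$$ \begin{bmatrix} 1 & d \\ 0 & 1 \end{bmatrix} $$

Window, starting and stopping in air:
$$ \begin{bmatrix} 1 & d/n \\ 0 & 1 \end{bmatrix} $$

Thin lens or mirror:
$$ \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} $$
Thin lens: $\frac{1}{f} = (n-1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$
Mirror: $\frac{1}{f} = \frac{2}{R}$

Thick lens:
$$ \begin{bmatrix} 1 + \frac{d}{R_1}\left(\frac{1}{n} - 1\right) & \frac{d}{n} \\ -(n-1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) + \frac{d}{R_1 R_2}\left(2 - \frac{1}{n} - n\right) & 1 + \frac{d}{R_2}\left(1 - \frac{1}{n}\right) \end{bmatrix} $$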



(9.44) ABCD matrix for a thin lens 









Figure 9.10 Thin lens.

Figure 9.11 Window.



and in the case of a curved mirror, the focal length is 

f = R/2 (focal length for a curved mirror) (9.47) 

Table 9.1 is a summary of ABCD matrices of common optical elements. 
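As a quick numerical check of the thin-lens result, one can multiply the two curved-interface matrices and compare the lower-left element with the lens maker's formula. The following is a minimal sketch; the helper names and the lens prescription (n = 1.5, radii of +0.10 m and -0.10 m) are hypothetical choices of ours.

```python
def matmul(M2, M1):
    """Matrix product M2 M1 for matrices stored as ((A, B), (C, D))."""
    (a, b), (c, d) = M2
    (e, f), (g, h) = M1
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def curved_interface(n_i, n_t, R):
    """ABCD matrix for a curved interface, following (9.43)."""
    return ((1.0, 0.0), ((n_i / n_t - 1.0) / R, n_i / n_t))

def thin_lens_from_surfaces(n, R1, R2):
    """Entering surface on the right, exiting surface on the left."""
    return matmul(curved_interface(n, 1.0, R2), curved_interface(1.0, n, R1))

def lens_maker_focal_length(n, R1, R2):
    """Focal length from the lens maker's formula."""
    return 1.0 / ((n - 1.0) * (1.0 / R1 - 1.0 / R2))

# Hypothetical biconvex lens: R1 positive, R2 negative, lengths in meters
n, R1, R2 = 1.5, 0.10, -0.10
lens = thin_lens_from_surfaces(n, R1, R2)
f = lens_maker_focal_length(n, R1, R2)
```

For this prescription `f` is 0.10 m, and the matrix product has the thin-lens form with C equal to -1/f to within rounding.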

Example 9.6 

Derive the ABCD matrix for a window with thickness d and index n. 

Solution: We can again take advantage of the ABCD matrix for a curved interface 
(9.43), only in this problem we will let $R_1 \to \infty$ and $R_2 \to \infty$ to provide flat surfaces.
We take the index outside of the window to be unity and the index inside the 
window to be n. We use the ABCD matrix (9.43) twice, once for each interface, 
sandwiching matrix (9.31), which endows the window with thickness: 



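$$ \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & n \end{bmatrix} \begin{bmatrix} 1 & d \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1/n \end{bmatrix} = \begin{bmatrix} 1 & d/n \\ 0 & 1 \end{bmatrix} \qquad \text{(window)} \qquad (9.48) $$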



As far as rays are concerned, a window is effectively shorter to traverse than free
space.⁷ Fig. 9.11 illustrates why this is the case. The displacement of the exiting ray
is not as great as it would have been without the window. The window impedes 
the rate at which the ray can move away from or toward the optical axis. 
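The window result can be confirmed numerically by sandwiching a propagation matrix between two flat interfaces. This is a sketch under our own assumptions: the helper names and the sample window (12 mm thick, index 1.5) are hypothetical.

```python
def matmul(M2, M1):
    """Matrix product M2 M1 for matrices stored as ((A, B), (C, D))."""
    (a, b), (c, d) = M2
    (e, f), (g, h) = M1
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def flat_interface(n_i, n_t):
    """Curved-interface matrix (9.43) in the limit R -> infinity."""
    return ((1.0, 0.0), (0.0, n_i / n_t))

def propagate(d):
    """Propagation through a distance d within one material."""
    return ((1.0, d), (0.0, 1.0))

def window(d, n):
    """Air -> glass, travel d, glass -> air (first element on the right)."""
    return matmul(flat_interface(n, 1.0),
                  matmul(propagate(d), flat_interface(1.0, n)))

# Hypothetical window: 12 mm thick, index 1.5
W = window(0.012, 1.5)
effective_length = W[0][1]   # expected to equal d / n
```

Here `effective_length` comes out as 0.008 m, shorter than the physical thickness by the factor n.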




Figure 9.12 A ray that travels 
through a distance a, reflects from 
a mirror, and then travels through 
a distance b. 



Example 9.7 

Find the ray $\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$ that results when $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$ propagates through a distance $a$, reflects from
a mirror of radius $R$, and then propagates through a distance $b$. See Fig. 9.12.

Solution: The final ray in terms of the initial one is computed as follows: 



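$$ \begin{aligned} \begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} &= \begin{bmatrix} 1 & b \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -2/R & 1 \end{bmatrix} \begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} \\ &= \begin{bmatrix} 1 - 2b/R & a + b - 2ab/R \\ -2/R & 1 - 2a/R \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} \\ &= \begin{bmatrix} (1 - 2b/R)\, y_1 + (a + b - 2ab/R)\, \theta_1 \\ (-2/R)\, y_1 + (1 - 2a/R)\, \theta_1 \end{bmatrix} \end{aligned} \qquad (9.49) $$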

As always, the ordering of the matrices is important. The first effect that the ray
experiences is represented by the matrix on the right, which is in the position that
first operates on $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$.
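The importance of ordering is easy to see numerically. The sketch below multiplies the three matrices of this example for hypothetical values of a, b, and R (our own choices), compares the product with the closed form in (9.49), and also forms the product in the reversed order for contrast.

```python
def matmul(M2, M1):
    """Matrix product M2 M1 (M1 acts on the ray first)."""
    (a, b), (c, d) = M2
    (e, f), (g, h) = M1
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def propagate(d):
    return ((1.0, d), (0.0, 1.0))

def mirror(R):
    return ((1.0, 0.0), (-2.0 / R, 1.0))

# Hypothetical distances and mirror radius, in meters
a, b, R = 0.30, 0.20, 0.50

# Correct order: travel a (rightmost), reflect, then travel b (leftmost)
M = matmul(propagate(b), matmul(mirror(R), propagate(a)))

# Closed form from (9.49)
expected = ((1 - 2 * b / R, a + b - 2 * a * b / R),
            (-2 / R, 1 - 2 * a / R))

# Swapping a and b changes the diagonal elements
wrong_order = matmul(propagate(a), matmul(mirror(R), propagate(b)))
```

With these numbers the correct product has A = 0.2, whereas the reversed ordering gives A = -0.2.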



7 In contrast, the optical path length OPL is effectively longer than free space by the factor n.

We have derived our basic ABCD matrices for rays traveling in the y-z plane,
as suggested in Figs. 9.7-9.12. This may have given the impression that it is
necessary to work within a plane that contains the optical axis (i.e. the z-axis
in our case). However, within the paraxial approximation, the ABCD matrices
are valid for rays that become displaced simultaneously in both the x and y
dimensions while propagating along z.

As we demonstrate below, rays behave independently in
the x and y dimensions. If desired, one can write a ray vector for each dimension,
namely $\begin{bmatrix} x \\ \theta_x \end{bmatrix}$ and $\begin{bmatrix} y \\ \theta_y \end{bmatrix}$. Moreover, the identical matrices, for example any
in Table 9.1, are used for either dimension. Figs. 9.7-9.12 therefore represent
projections of rays onto the y-z plane. To complete the story, one can imagine
corresponding figures representing the projection of the rays onto the x-z plane.

Independence of Rays in the x and y Dimensions 

Imagine a ray contained within a plane that is parallel to the y-z plane but for
which $x > 0$. One might be concerned that when the ray meets, for example, a
spherically concave mirror, the radius of curvature in the perspective of the y-z
dimension might be different for $x > 0$ than for $x = 0$ (at the center of the mirror).
This concern is actually quite legitimate and is the source of what is known as
spherical aberration. Nevertheless, in the paraxial approximation the intersection
with the curved mirror of all planes that are parallel to the optical axis gives the
same curve.

To see why this is so, consider the curvature of the mirror in Fig. 9.8. As we
move away from the mirror center (in the x or y-dimension or some combination
thereof), the mirror surface deviates to the left by the amount

$$ \delta = R - R\cos\phi \qquad (9.50) $$

In the paraxial approximation, we have $\cos\phi \cong 1 - \phi^2/2$. And since in this approximation
we may also write $\phi \cong \sqrt{x^2 + y^2}/R$, (9.50) becomes

$$ \delta \cong \frac{x^2}{2R} + \frac{y^2}{2R} \qquad (9.51) $$

In the paraxial approximation, we see that the curve of the mirror is parabolic, and
therefore separable between the x and y dimensions. That is, the curvature in
the x-dimension (i.e. $\partial\delta/\partial x = x/R$) is independent of y, and the curvature in the
y-dimension (i.e. $\partial\delta/\partial y = y/R$) is independent of x. A similar argument can be
made for a spherical interface between two media.
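To get a feeling for how good the parabolic approximation is, one can compare it with the exact depth of a spherical surface. The sketch below is our own illustration; the function names and the sample values (R = 1 m, with x and y of a centimeter or two) are assumptions. The exact depth is obtained from the geometry of a sphere, for which R cos(phi) equals the square root of R^2 - x^2 - y^2.

```python
import math

def sag_exact(x, y, R):
    """Exact deviation of a spherical surface of radius R at (x, y)."""
    return R - math.sqrt(R * R - x * x - y * y)

def sag_paraxial(x, y, R):
    """Parabolic approximation: separate contributions from x and y."""
    return x * x / (2.0 * R) + y * y / (2.0 * R)

# Hypothetical values in meters
R, x, y = 1.0, 0.01, 0.02
exact = sag_exact(x, y, R)
approx = sag_paraxial(x, y, R)
error = exact - approx   # leading correction is (x^2 + y^2)^2 / (8 R^3)
```

For these values the two expressions differ by roughly 3e-8 m out of 2.5e-4 m, and the paraxial form is exactly the sum of an x-only term and a y-only term.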



9.6 Image Formation 

Consider Example 9.7 where a ray travels a distance a, reflects from a curved 
mirror, and then travels a distance b. From (9.49), the ABCD matrix for the overall 




Galileo di Vincenzo Bonaiuti de' 
Galilei (1564-1642, Italian) was born 
in Pisa, Italy, the son of a musician. 
Galileo enrolled in the University of Pisa 
with the intent to study medicine but 
soon became diverted into mathematics. 
He served three years as chair of math- 
ematics in Pisa beginning in 1589 and 
then moved to the University of Padua 
where he taught geometry, mechanics, 
and astronomy for two decades. While 
Galileo did not invent the telescope, he 
considerably improved the design. With 
it he discovered four moons of Jupiter 
and was the first to observe sunspots 
and mountains and valleys on the Moon. 
Galileo also was the first to document 
the phases of Venus, similar to the 
phases of the moon. He used these 
observations to argue in favor of the 
Copernican model of the solar system, 
but this conflicted with the prevailing 
views of the Catholic Church at the 
time, and he was placed under house 
arrest and forbidden to publish any
of his works. While under house arrest,
he wrote much on kinematics and other 
principles of physics and is considered to 
be the father of modern physics. Galileo 
attempted to measure the speed of 
light by observing an assistant uncover 
a lantern on a distant hill in response 
to a light signal. He concluded that 
light is "really fast" if not instantaneous. 
(Wikipedia)

Figure 9.13 Image formation by a thin lens.

process is

$$ \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 - b/f & a + b - ab/f \\ -1/f & 1 - a/f \end{bmatrix} \qquad (9.52) $$

where by (9.47) we have replaced $2/R$ with $1/f$. Because of the similarity between
the behavior of a curved mirror and a thin lens, the above expression can also
represent a ray traveling a distance $a$, traversing a thin lens with focal length $f$,
and then traveling a distance $b$. The only difference is that, in the case of the thin
lens, $f$ is given by the lens maker's formula (9.46).

As is well known, it is possible to form an image with either a curved mirror
or a lens. Suppose that the initial ray is one of many rays that leaves a particular
point on an object positioned $a = d_o$ before the mirror (or lens). In order for an
image to occur at $d_i = b$, it is essential that all rays leaving the particular point on
the object converge to a corresponding point on the image. That is, we want rays
leaving the point $y_1$ on the object (which may take on a range of angles $\theta_1$) all to
converge to a single point $y_2$ at the image. In the following equation we need $y_2$
to be independent of $\theta_1$:

$$ \begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} = \begin{bmatrix} A y_1 + B \theta_1 \\ C y_1 + D \theta_1 \end{bmatrix} \qquad (9.53) $$

The condition for image formation is therefore

$$ B = 0 \qquad \text{(condition for image formation)} \qquad (9.54) $$

When this condition is applied to (9.52), we obtain

$$ d_o + d_i - \frac{d_o d_i}{f} = 0 \quad \Rightarrow \quad \frac{1}{f} = \frac{1}{d_o} + \frac{1}{d_i} \qquad (9.55) $$

which is the familiar imaging formula (9.1). When the object is infinitely far away
(i.e. $d_o \to \infty$), the image appears at $d_i \to f$. This gives a physical interpretation
to the focal length $f$, as we have been calling it. Please note that $d_o$ and $d_i$ can
each be either positive (real, as depicted in Fig. 9.13) or negative (virtual, meaning
a screen cannot be inserted to display the image).

The magnification of the image is found by comparing the size of $y_2$ to $y_1$.
From (9.52)-(9.55), the magnification is found to be

$$ M \equiv \frac{y_2}{y_1} = A = 1 - \frac{d_i}{f} = -\frac{d_i}{d_o} \qquad (9.56) $$

The negative sign indicates that for positive distances $d_o$ and $d_i$ the image is
inverted.
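A short numerical sketch ties these pieces together. Writing the object distance as d_o and the image distance as d_i, we compute d_i from the imaging formula, build the overall matrix of (9.52) with a = d_o and b = d_i, and confirm that B vanishes while A gives the magnification. The function names and the sample values (f = 0.10 m, d_o = 0.30 m) are hypothetical choices of ours.

```python
def image_distance(f, d_o):
    """Solve 1/f = 1/d_o + 1/d_i for the image distance d_i."""
    return 1.0 / (1.0 / f - 1.0 / d_o)

def overall_matrix(f, d_o, d_i):
    """Matrix (9.52) with a = d_o and b = d_i."""
    return ((1.0 - d_i / f, d_o + d_i - d_o * d_i / f),
            (-1.0 / f, 1.0 - d_o / f))

# Hypothetical thin lens and object position, in meters
f, d_o = 0.10, 0.30
d_i = image_distance(f, d_o)
(A, B), (C, D) = overall_matrix(f, d_o, d_i)
magnification = A   # equals -d_i / d_o when B = 0
```

For these numbers the image forms at d_i = 0.15 m with magnification -0.5, i.e. inverted and half the size of the object.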

In the above discussion, we have examined image formation by a thin lens 
or a curved mirror. Of course, images can also be formed by thick lenses or by 
more complex composite optical systems (e.g. a system of lenses and spaces). 
The ABCD matrices for the elements in a composite system are simply multiplied 
together (the first element that rays encounter appearing on the right) to obtain an 






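overall ABCD matrix. The principles for image formation with an arbitrary ABCD
matrix are the same as those for a thin lens or curved mirror. As before, consider
propagation a distance $d_o$ from an object to the optical element followed by
propagation a distance $d_i$ to an image. The ABCD matrix for the overall operation
is

$$ \begin{bmatrix} 1 & d_i \\ 0 & 1 \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} 1 & d_o \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} A + d_i C & d_o A + B + d_o d_i C + d_i D \\ C & d_o C + D \end{bmatrix} \equiv \begin{bmatrix} A' & B' \\ C' & D' \end{bmatrix} \qquad (9.57) $$

An image occurs according to (9.54) when $B' = 0$, or

$$ d_o A + B + d_o d_i C + d_i D = 0, \qquad (9.58) $$

with magnification

$$ M = A + d_i C \qquad (9.59) $$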



For a complex lens system, the matrix elements A, B, C and D can be complicated 
expressions. There is a convenient way to simplify the analysis, which is discussed 
in the next section. 
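Before moving on, the general imaging condition can be tried on a simple composite system. In the sketch below we solve B' = 0 for the image distance, which gives d_i = -(d_o A + B)/(d_o C + D), and then evaluate M = A + d_i C. The two-lens system (focal lengths 0.10 m and 0.20 m separated by 0.05 m), the object distance, and all function names are hypothetical illustrations of ours.

```python
def matmul(M2, M1):
    """Matrix product M2 M1 (M1 acts on the ray first)."""
    (a, b), (c, d) = M2
    (e, f), (g, h) = M1
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def propagate(d):
    return ((1.0, d), (0.0, 1.0))

def thin_lens(f):
    return ((1.0, 0.0), (-1.0 / f, 1.0))

def image_distance(M, d_o):
    """Distance after the system at which B' = 0."""
    (A, B), (C, D) = M
    return -(d_o * A + B) / (d_o * C + D)

def magnification(M, d_i):
    (A, B), (C, D) = M
    return A + d_i * C

# Hypothetical system: lens (f = 0.10), gap of 0.05, lens (f = 0.20)
system = matmul(thin_lens(0.20), matmul(propagate(0.05), thin_lens(0.10)))
d_o = 0.30                      # measured from the first lens
d_i = image_distance(system, d_o)   # measured from the second lens
mag = magnification(system, d_i)
```

The result, d_i = 1/15 m and M = -1/3, agrees with applying the thin-lens imaging formula to each lens in turn.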



Example 9.8 

Beginning students are often taught to draw ray diagrams such as the one in Fig. 
9.14, which shows a real image formed by a thin lens. Several key rays aid in a 
graphic prediction of the location and size of the image. Use ABCD-matrix analysis 
to describe the effect of the lens on the three rays drawn. 




Figure 9.14 Formation of a real image by a thin lens. 



general condition for image 
formation 



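Solution: Ray A is parallel to the axis with height $y_1$ before traversing the lens. Just
after the lens, ray A is described by

$$ \begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ 0 \end{bmatrix} = \begin{bmatrix} y_1 \\ -y_1/f \end{bmatrix} $$

which crosses the axis at the focus $d_i = f$, since

$$ \begin{bmatrix} 1 & f \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ -y_1/f \end{bmatrix} = \begin{bmatrix} 0 \\ -y_1/f \end{bmatrix} $$

Meanwhile, ray B traverses the lens just where it crosses the axis. The lens
does nothing to this ray:

$$ \begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \begin{bmatrix} 0 \\ \theta_1 \end{bmatrix} = \begin{bmatrix} 0 \\ \theta_1 \end{bmatrix} $$

Ray B is undeflected.

Finally, ray C, which goes through the focal point at $d_o = f$ before the lens, becomes
parallel to the axis following the lens:

$$ \begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_1/f \end{bmatrix} = \begin{bmatrix} y_1 \\ 0 \end{bmatrix} $$

Note that starting from the left focus, we have just before the lens

$$ \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} = \begin{bmatrix} 1 & f \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ \theta_1 \end{bmatrix} = \begin{bmatrix} f\theta_1 \\ \theta_1 \end{bmatrix} $$

so that $\theta_1 = y_1/f$.

Figure 9.15 A multi-element system represented as an ABCD matrix for which principal planes
always exist.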



9.7 Principal Planes for Complex Optical Systems 

For every ABCD matrix representing an optical system, there exist two principal
planes,⁸ located (in our convention) a distance $p_1$ before entering the system and
a distance $p_2$ after exiting the system. When the matrices corresponding to these
(appropriately chosen) distances are appended to the original ABCD matrix of
the system, the overall matrix simplifies to one that looks identical to the matrix
for a simple thin lens (9.45).

With knowledge of the positions of the principal planes, one can treat the
complicated imaging system in the same way that one treats a simple thin lens.
That is, we can simply use the common formulas (9.55) and (9.56). The only
difference is that $d_o$ is the distance from the object to the first principal plane and
$d_i$ is the distance from the second principal plane to the image. In the case of an
actual thin lens, both principal planes are at $p_1 = p_2 = 0$. For a composite system,
$p_1$ and $p_2$ can be either positive or negative.

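We assert that for any optical system,⁹ $p_1$ and $p_2$ can always be selected such
that we can write

$$ \begin{aligned} \begin{bmatrix} 1 & p_2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} 1 & p_1 \\ 0 & 1 \end{bmatrix} &= \begin{bmatrix} A + p_2 C & p_1 A + B + p_1 p_2 C + p_2 D \\ C & p_1 C + D \end{bmatrix} \\ &= \begin{bmatrix} 1 & 0 \\ -1/f_{\text{eff}} & 1 \end{bmatrix} \end{aligned} \qquad (9.60) $$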

The final matrix is that of a simple thin lens, and it takes the place of the composite 
system including the distances to the principal planes. 



8 R. Guenther, Modern Optics, p. 186 (New York: Wiley, 1990).
9 The starting and ending refractive index must be the same.

Determination of $p_1$ and $p_2$ and Justification of (9.60)

Our task is to find the values of $p_1$ and $p_2$ that make (9.60) true. We can straightaway
make the definition

$$ f_{\text{eff}} \equiv -1/C \qquad (9.61) $$

We can also solve for $p_1$ and $p_2$ by setting the diagonal elements of the matrix to 1.
Explicitly, we get

$$ p_1 C + D = 1 \quad \Rightarrow \quad p_1 = \frac{1 - D}{C} \qquad (9.62) $$

and

$$ A + p_2 C = 1 \quad \Rightarrow \quad p_2 = \frac{1 - A}{C} \qquad (9.63) $$



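It remains to be shown that the upper right element in (9.60) (i.e. $p_1 A + B +
p_1 p_2 C + p_2 D$) automatically goes to zero for our choices of $p_1$ and $p_2$. This may
seem unlikely at first, but watch what happens!

When (9.62) and (9.63) are substituted into the upper right matrix element of (9.60)
we get

$$ \begin{aligned} p_1 A + B + p_1 p_2 C + p_2 D &= \frac{1-D}{C} A + B + \frac{1-D}{C}\,\frac{1-A}{C}\, C + \frac{1-A}{C} D \\ &= \frac{1}{C}\left[1 - AD + BC\right] \\ &= \frac{1}{C}\left[1 - \begin{vmatrix} A & B \\ C & D \end{vmatrix}\right] \end{aligned} \qquad (9.64) $$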



This vanishes (as desired) if the determinant of the original ABCD matrix equals
one. Fortunately, this is always the case as long as we begin and end in the same
index of refraction:

$$ \begin{vmatrix} A & B \\ C & D \end{vmatrix} = AD - BC = 1 \qquad (9.65) $$

Notice that the determinants of all of the matrices in Table 9.1 are one. Moreover,
ABCD matrices constructed of these will also have determinants equal to one.¹⁰
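The recipe for the principal planes is compact enough to check numerically. In the sketch below, the system matrix is a hypothetical one of our own choosing (it corresponds to thin lenses of focal length 0.10 m and 0.20 m separated by 0.05 m, and has unit determinant); the function names are likewise assumptions. We compute p_1, p_2, and the effective focal length, and then verify that appending the two distances produces the thin-lens form.

```python
def matmul(M2, M1):
    """Matrix product M2 M1 (M1 acts on the ray first)."""
    (a, b), (c, d) = M2
    (e, f), (g, h) = M1
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def propagate(d):
    return ((1.0, d), (0.0, 1.0))

def principal_planes(M):
    """Return (p1, p2, f_eff) for an ABCD matrix with unit determinant."""
    (A, B), (C, D) = M
    return (1.0 - D) / C, (1.0 - A) / C, -1.0 / C

# Hypothetical composite system with determinant AD - BC = 1
system = ((0.5, 0.05), (-12.5, 0.75))
p1, p2, f_eff = principal_planes(system)

# Append the distances: p1 before the system (right), p2 after it (left)
equivalent = matmul(propagate(p2), matmul(system, propagate(p1)))
```

Here p_1 = -0.02 m, p_2 = -0.04 m, and f_eff = 0.08 m. The negative values indicate that both principal planes lie between the two lenses, and `equivalent` reduces to a thin-lens matrix with focal length f_eff.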



9.8 Stability of Laser Cavities 

The ABCD matrix formulation provides a powerful tool to analyze the stability of a
laser cavity.¹¹ The basic elements of a laser cavity include an amplifying medium
and mirrors to provide feedback. Presumably, at least one of the end mirrors is 
partially transmitting so that energy is continuously extracted from the cavity. 
Here, we dispense with the amplifying medium and concentrate our attention on 
the optics providing the feedback. 

10 The determinant of (9.43) is not one since it starts and ends with different indices of refraction. 
However, when this matrix is used in succession to form a lens, the resulting matrix has determinant 
equal to one. 

11 P. W. Milonni and J. H. Eberly, Lasers, Sect. 14.3 (New York: Wiley, 1988).

Figure 9.16 (a) A ray bouncing between two parallel flat mirrors. (b) A ray bouncing
between two curved mirrors in an unstable configuration. (c) A ray bouncing
between two curved mirrors in a stable configuration. (d) Stable cavity utilizing
a lens and two flat end mirrors.



As might be expected, the mirrors must be carefully aligned or successive 
reflections might cause rays to 'walk' continuously away from the optical axis, 
so that they eventually leave the cavity out the side. If a simple cavity is formed 
with two flat mirrors that are perfectly aligned parallel to each other, one might 
suppose that the mirrors would provide ideal feedback. However, all rays except 
for those that are perfectly aligned to the mirror surface normals would eventually 
wander out of the side of the cavity as illustrated in Fig. 9.16a. Such a cavity is said 
to be unstable. We would like to do a better job of trapping the light in the cavity. 

To improve the situation, a cavity can be constructed with concave end mir- 
rors to help confine the beams within the cavity. Even so, one must choose 
carefully the curvature of the mirrors and their separation L. If this is not done 
correctly, the curved mirrors can 'overcompensate' for the tendency of the rays 
to wander out of the cavity and thus aggravate the problem. Such an unstable 
scenario is depicted in Fig. 9.16b. 

Figure 9.16c depicts a cavity made with curved mirrors where the separation 
L is chosen appropriately to make the cavity stable. Although a ray, as it makes 
successive bounces, can strike the end mirrors at a variety of points, the curvature 
of the mirrors keeps the 'trajectories' contained within a narrow region so that 
they cannot escape out the sides of the cavity. 

There are many ways to make a stable laser cavity. For example, a stable cavity
can be made using a lens between two flat end mirrors as shown in Fig. 9.16d. Any
combination of lenses (perhaps more than one) and curved mirrors can be used
to create stable cavity configurations. Ring cavities can also be made to be stable
where in no place do the rays retro-reflect from a mirror but circulate through
a series of elements like cars going around a racetrack. The ABCD matrix for a
round trip in the cavity will be useful for this analysis.

Example 9.9 

Find the round-trip ABCD matrix for the cavities shown in Figs. 9.16c and 9.16d.
Solution: The round-trip ABCD matrix for the cavity shown in Fig. 9.16c is

$$ \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -2/R_1 & 1 \end{bmatrix} \begin{bmatrix} 1 & L \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -2/R_2 & 1 \end{bmatrix} \begin{bmatrix} 1 & L \\ 0 & 1 \end{bmatrix} \qquad (9.66) $$

where we have begun the round trip just after a reflection from the first mirror,
and where $R_1$ and $R_2$ denote the radii of curvature of the first and second mirrors.
The round-trip ABCD matrix for the cavity shown in Fig. 9.16d is

$$ \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \begin{bmatrix} 1 & 2L_1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \begin{bmatrix} 1 & 2L_2 \\ 0 & 1 \end{bmatrix} \qquad (9.67) $$

where we have begun the round trip just after a transmission through the lens
moving to the right, and where $L_1$ and $L_2$ denote the distances from the lens to
the left and right end mirrors. It is somewhat arbitrary where a round trip begins. The
multiplication of the above matrices will need to be carried out to do problems
P9.15 and P9.16.

To determine whether a given configuration of a cavity is stable, we need to
know what a ray does after making many round trips in the cavity. To find the
effect of propagation through many round trips, we multiply the round-trip ABCD
matrix together N times, where N is the number of round trips that we wish to
consider. We can then examine what happens to an arbitrary ray after making N
round trips in the cavity as follows:

$$ \begin{bmatrix} y_{N+1} \\ \theta_{N+1} \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}^N \begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix} \qquad (9.68) $$

At this point you might be concerned that taking an ABCD matrix to the $N$th
power can be a lot of work. (It is already significant work just to compute the
ABCD matrix for a single round trip.) In addition, we are interested in letting N
be very large, perhaps even infinity. You can relax because we have a neat trick to
accomplish this daunting task.

By Sylvester's theorem in Appendix 0.3, we have

$$ \begin{bmatrix} A & B \\ C & D \end{bmatrix}^N = \frac{1}{\sin\theta} \begin{bmatrix} A\sin N\theta - \sin(N-1)\theta & B\sin N\theta \\ C\sin N\theta & D\sin N\theta - \sin(N-1)\theta \end{bmatrix} \qquad (9.69) $$

where

$$ \cos\theta = \frac{1}{2}(A + D). \qquad (9.70) $$



This is valid as long as the determinant of the ABCD matrix is one. As noted 
earlier (see (9.65)), we are in luck! The determinant is one any time a ray begins 
and stops in the same refractive index, which by definition is guaranteed for any 
round trip. We therefore can employ Sylvester's theorem for any N that we might 
choose, including very large integers. 
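Sylvester's theorem is easy to test against brute-force multiplication. The sketch below does so for a hypothetical matrix of our own choosing, which has unit determinant and (A + D)/2 = 0.625 so that a real angle exists; the function names are assumptions as well.

```python
import math

def matmul(M2, M1):
    """Matrix product M2 M1 for matrices stored as ((A, B), (C, D))."""
    (a, b), (c, d) = M2
    (e, f), (g, h) = M1
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def power_by_multiplication(M, N):
    """Multiply M together N times (N >= 1)."""
    result = M
    for _ in range(N - 1):
        result = matmul(M, result)
    return result

def power_by_sylvester(M, N):
    """Closed form for M^N; requires det M = 1 and |A + D| < 2."""
    (A, B), (C, D) = M
    theta = math.acos(0.5 * (A + D))
    s = math.sin(theta)
    sN = math.sin(N * theta)
    sN1 = math.sin((N - 1) * theta)
    return (((A * sN - sN1) / s, B * sN / s),
            (C * sN / s, (D * sN - sN1) / s))

# Hypothetical matrix with determinant one
M = ((0.5, 0.05), (-12.5, 0.75))
N = 7
direct = power_by_multiplication(M, N)
closed_form = power_by_sylvester(M, N)
```

The two results agree to within rounding, and the closed form costs the same for N = 7 as it would for N = 7000.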

We would like the elements of (9.69) to remain finite as N becomes very large.
If this is the case, then we know that a ray remains trapped within the cavity
and stays reasonably close to the optical axis. Since N only appears within the
argument of a sine function, which is always bounded between $-1$ and $1$ for
real arguments, it might seem that the elements of (9.69) always remain finite
as N approaches infinity. However, it turns out that $\theta$ can become imaginary
depending on the outcome of (9.70), in which case the sine becomes a hyperbolic
sine, which can 'blow up' as N becomes large. In the end, the condition for cavity
stability is that a real $\theta$ must exist for (9.70), or in other words we need

$$ -1 < \frac{1}{2}(A + D) < 1 \qquad \text{(condition for a stable cavity)} \qquad (9.71) $$
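As a numerical illustration of this condition, the sketch below builds a round-trip matrix for a cavity made of two mirrors separated by a distance L and tests the value of (A + D)/2. The round trip is taken to start just after reflection from the first mirror: travel L, reflect from the second mirror, travel L, reflect from the first mirror. The symbols R1 and R2 for the mirror radii, the function names, and the sample numbers are our own assumptions.

```python
def matmul(M2, M1):
    """Matrix product M2 M1 (M1 acts on the ray first)."""
    (a, b), (c, d) = M2
    (e, f), (g, h) = M1
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def propagate(d):
    return ((1.0, d), (0.0, 1.0))

def mirror(R):
    return ((1.0, 0.0), (-2.0 / R, 1.0))

def round_trip_two_mirrors(L, R1, R2):
    """Travel L, reflect from mirror 2, travel L, reflect from mirror 1."""
    T = propagate(L)
    return matmul(mirror(R1), matmul(T, matmul(mirror(R2), T)))

def is_stable(M):
    """Stability condition: -1 < (A + D)/2 < 1."""
    (A, B), (C, D) = M
    return -1.0 < 0.5 * (A + D) < 1.0

# Hypothetical cavities with L = 1 m
gentle_mirrors = round_trip_two_mirrors(1.0, 4.0, 4.0)    # R = 4 m
flat_mirrors = round_trip_two_mirrors(1.0, float("inf"), float("inf"))
tight_mirrors = round_trip_two_mirrors(1.0, 0.4, 0.4)     # R = 0.4 m
```

For these values only the first cavity satisfies the condition. The flat-mirror cavity sits exactly on the boundary, with (A + D)/2 = 1, and the strongly curved mirrors overcompensate.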



It is left as an exercise to apply this condition to (9.66) and (9.67) to find the
necessary relationships between the various element curvatures and spacing in
order to achieve cavity stability.

Appendix 9.A Aberrations and Ray Tracing




Figure 9.17 (a) Paraxial theory predicts that the light imaged from
a point source will converge to a point (i.e. have spherical wave
fronts coming to the image point). (b) The image of a point source
made by a real lens with aberrations is an extended and blurred
patch of light and the converging wavefronts are only quasi-spherical.




Figure 9.18 Ray tracing through a 
simple lens. 



The paraxial approximation places serious limitations on the performance of
optical systems (see (9.23) and (9.24)). To stay within the approximation, all rays
traveling in the system should travel very close to the optical axis and make very
shallow angles with it. To the extent that this is not the case, the
collection of rays associated with a single point on an object may not converge to 
a single point on the associated image. The resulting distortion or "blurring" of 
the image is known as aberration. 

Common experience with photographic and video equipment suggests that 
it is possible to image scenes that have a relatively wide angular extent (many 
tens of degrees), in apparent serious violation of the paraxial approximation. 
The paraxial approximation is indeed violated in these devices, so they must be 
designed using more complicated analysis techniques than those we have learned 
in this chapter. The most common approach is to use a computationally intensive
procedure called ray tracing in which $\sin\theta$ and $\tan\theta$ are rendered exactly. The
nonlinearity of these functions precludes the possibility of obtaining analytic 
solutions describing the imaging performance of such optical systems. 

The typical procedure is to start with a collection of rays from a test point such 
as shown in Fig. 9.18. Each ray is individually traced through the system using 
the exact representation of geometric surfaces as well as the exact representation 
of Snell's law. On close analysis, the rays typically do not converge to a distinct 
imaging point. Rather, the rays can be 'blurred' out over a range of points where 
the image is supposed to occur. Depending on the angular distribution of the 
rays as well as on the elements in the setup, the spread of rays around the image 
point can be large or small. The engineer who designs the system must determine 
whether the amount of aberration is acceptable, given the various constraints of 
the device. 
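The flavor of this procedure can be conveyed with a single surface. The sketch below is only an illustration and not how commercial packages work: it traces rays that arrive parallel to the axis onto a concave spherical mirror, using the exact intersection with the sphere and the exact law of reflection, and reports where each reflected ray crosses the axis. The geometry (mirror vertex at z = 0, center of curvature at z = -R, rays traveling in the +z direction), the function name, and the sample heights are our own assumptions.

```python
import math

def axis_crossing_exact(y, R):
    """Distance from the mirror vertex at which the reflected ray crosses
    the axis, for an incoming ray parallel to the axis at height y != 0."""
    z_hit = -R + math.sqrt(R * R - y * y)       # exact point on the sphere
    nz, ny = (-R - z_hit) / R, -y / R           # unit normal toward the center
    dz, dy = 1.0, 0.0                           # incoming direction (z, y)
    dot = dz * nz + dy * ny
    rz, ry = dz - 2.0 * dot * nz, dy - 2.0 * dot * ny   # law of reflection
    t = -y / ry                                 # path length to reach y = 0
    return -(z_hit + t * rz)

R = 1.0                                   # hypothetical mirror radius in meters
paraxial_focus = R / 2.0                  # prediction of (9.47)
near_axis = axis_crossing_exact(0.01, R)  # ray close to the axis
wide = axis_crossing_exact(0.30, R)       # ray far from the axis
```

The near-axis ray crosses within a few tens of micrometers of R/2, while the wide ray crosses about 24 mm closer to the mirror. This spread of crossing points is the blurring described above.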

To minimize aberrations below typical tolerance levels, several lenses can 
be used together. If properly chosen, the lenses (some positive, some negative) 
separated by specific distances, can result in remarkably low aberration levels 
over certain ranges of operation for the device. Ray tracing is best done with 
commercial software designed for this purpose (e.g. Zemax or other professional 
products). Such software packages are able to develop and optimize designs for 
specific applications. A nice feature is that the user can specify that the design 
should employ only standard optical components available from known optics 
companies. In any case, it is typical to specify that all lenses in the system should 
have spherical surfaces since these are much less expensive to manufacture. We 
mention briefly a few types of aberrations that you may encounter. Multiple 
aberrations can often be observed in a single lens. 

Chromatic aberration arises from the fact that the index of refraction for glass 
varies with the wavelength of light. Since the focal length of a lens depends on 
the index of refraction (see, for example, Eq. (9.46)), the focal length of a lens 
varies with the wavelength of light. Chromatic aberration can be compensated 
for by using a pair of lenses made from two types of glass as shown in Fig. 9.19
(the pair is usually cemented together to form a "doublet" lens). The lens with the
shortest focal length is made of the glass whose index has the lesser dependence 
on wavelength. By properly choosing the prescription of the two lenses, you 
can exactly compensate for chromatic aberration at two wavelengths and do a 
good job for a wide range of others. Achromatic doublets can also be designed to 
minimize spherical aberration (see below), so they are often a good choice when 
you need a high quality lens. 
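The size of the effect for a single lens can be estimated from the lens maker's formula. In the sketch below the two index values are illustrative numbers of our own choosing rather than data for any particular glass, and the lens radii and function name are likewise hypothetical.

```python
def thin_lens_focal_length(n, R1, R2):
    """Lens maker's formula for a thin lens in air."""
    return 1.0 / ((n - 1.0) * (1.0 / R1 - 1.0 / R2))

# Illustrative indices only: slightly higher for blue light than for red
n_blue, n_red = 1.53, 1.51
R1, R2 = 0.10, -0.10       # hypothetical biconvex lens, in meters

f_blue = thin_lens_focal_length(n_blue, R1, R2)
f_red = thin_lens_focal_length(n_red, R1, R2)
focal_shift = f_red - f_blue
```

With these numbers the blue focus lies a few millimeters closer to the lens than the red focus, out of a focal length of roughly 10 cm.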

Monochromatic aberrations arise from the shape of the lens rather than the
variation of n with wavelength. Before computers facilitated the
widespread use of ray tracing, these aberrations had to be analyzed primarily
with analytic techniques. The analytic results derived previously in this chapter
were based on first order approximations (e.g. $\sin\theta \approx \theta$). This analysis predicts
that a lens can image a point source to an exact image point, which predicts
spherically converging wavefronts at the image point as shown in Fig. 9.17(a). You 
can increase the accuracy of the theory for non-paraxial rays by retaining second- 
order correction terms in the analysis. With these second-order terms included, 
the wave fronts converging towards an image point are mostly spherical, but have 
second-order aberration terms added in (shown conceptually in Fig. 9.17(b)). 
There are five aberration terms in this second-order analysis, and these represent 
a convenient basis for discussing aberration. 

The first aberration term is known as spherical aberration. This type of aber- 
ration results from the fact that rays traveling through a spherical lens at large 
radii experience a different focal length than those traveling near the axis. For a 
converging lens, this causes wide-radius rays to focus before the near-axis rays 
as shown in Fig. 9.20. This problem can be helped by orienting lenses so that 
the face with the least curvature is pointed towards the side where the light rays 
have the largest angle. This procedure splits the bending of rays more evenly 
between the front and back surface of the lens. As mentioned above, you can also 
cement two lenses made from different types of glass together so that spherical 
aberrations from one lens are corrected by the other. 

The aberration term referred to as astigmatism occurs when an off-axis object 
point is imaged to an off-axis image point. In this case a spherical lens has a 
different focal length in the horizontal and vertical dimensions. For a focusing 
lens this causes the two dimensions to focus at different distances, producing a 
vertical line at one image plane and a horizontal line at another. A lens can also be 
inherently astigmatic even when viewed on axis if it is football shaped rather than 
spherical. In this case, the astigmatic aberration can be corrected by inserting a 
cylindrical lens at the correct orientation (this is a common correction needed in 
eyeglasses) . 

A third aberration term is referred to as coma. This is observed when off-axis 
points are imaged and produces a comet shaped tail with its head at the point 
predicted by paraxial theory. (The term 'coma' refers to the atmosphere of a 
comet, which is how the aberration got its name.) This aberration is distinct from 
astigmatism, which is also observed for off-axis points, since coma is observed 





Figure 9.19 Chromatic aberration causes lenses to have different
focal lengths for different wavelengths. It can be corrected using
an achromatic doublet lens. (Labels in figure: low dispersion glass,
high dispersion glass.)




Figure 9.20 Spherical aberration in a plano-convex lens.

Figure 9.21 Illustration of coma. Rays traveling through the center of the lens are
imaged to point a as predicted by paraxial theory. Rays that travel through the lens at
radius $\rho_b$ in the plane of the figure are imaged to point b. Rays that travel through the
lens at radius $\rho_b$, but outside the plane of the figure, are imaged to other points on the
circle (in the image plane) containing point b. Rays that travel through the lens at
other radii (e.g. $\rho_c$) also form circles in the image plane with radius proportional
to $\rho^2$ and with the center offset from point a by a distance proportional to $\rho^2$. When
light from each of these circles combines on the screen it produces an imaged point
with a "comet tail."



Figure 9.22 Distortion occurs when magnification is not constant
across an extended image. (Panels: Barrel Distortion, Pincushion Distortion.)



even when all of the rays are in one plane (see Fig. 9.21). You have probably seen 
coma if you've ever played with a magnifying glass in the sun — just tilt the lens 
slightly and you see a comet-like image rather than a point. 

The curvature of the field aberration term arises from the fact that spherical 
lenses image spherical surfaces to another spherical surface, rather than imaging 
a plane to a plane. This is not so bad for your eyeball, which has a curved screen, 
but for things like cameras and movie projectors we would like to image to a flat 
screen. When a flat screen is used and the curvature of the field aberration is
present, the image will be well focused near the center, but become progressively
out of focus as you move to the edge of the screen (i.e. the flat screen is farther
from the curved image surface as you move from the center).

The final aberration term is referred to as distortion. This aberration occurs 
when the magnification of a lens depends on the distance from the center of 
the screen. If magnification decreases as the distance from the center increases, 
then 'barrel' distortion is observed. When magnification increases with distance, 
'pincushion' distortion is observed (see Fig. 9.22). 

All lenses will exhibit some combination of the aberrations listed above (i.e. 
chromatic aberration plus the five second-order aberration terms). In addition to 
the five named monochromatic aberrations, there are many other higher order 
aberrations that also have to be considered. Aberrations can be corrected to a high
degree with multiple-element systems (designed using ray-tracing techniques)
composed of lenses and irises to eliminate off-axis light. For example, a camera
lens with a focal length of 50 mm, one of the simplest lenses in photography, is 
typically composed of about six individual elements. However, optical systems 
never completely eliminate all aberration, so designing a system always involves 
some degree of compromise in choosing which aberrations to minimize and 
which ones you can live with.

Exercises for 9.1 The Eikonal Equation

P9.1 Consider the index described in Example 9.1. The solution given in
the example corresponds to rays that asymptotically approach y = 0. A
more general solution is given by

$$\nabla R = n_0\left(\hat{\mathbf{x}}\sqrt{1+a} \pm \hat{\mathbf{y}}\sqrt{y^2/h^2 - a}\right) \qquad (1 + a > 0 \ \text{and} \ y^2/h^2 - a > 0)$$

This corresponds to rays that either hit the ground or return toward the
sky without reaching the ground, depending on the sign of a.

(a) Verify that ∇R satisfies the eikonal equation and determine the
function R(x, y).

HINT: $\int dt\,\sqrt{t^2 - a} = \frac{t}{2}\sqrt{t^2 - a} - \frac{a}{2}\ln\left(t + \sqrt{t^2 - a}\right)$ $(t^2 - a > 0)$.

(b) Verify that the light path is given by $y = h\sqrt{a}\,\cosh\frac{x - x_0}{h\sqrt{1+a}}$ when
a > 0 and is given by $y = h\sqrt{|a|}\,\sinh\frac{x - x_0}{h\sqrt{1+a}}$ when a < 0. Consider only
the region y > 0 (i.e. above ground). Notice that these solutions can
make rays that travel either to the right or to the left.

HINT: $\cosh^2\xi - \sinh^2\xi = 1$, $\frac{d}{d\xi}\cosh\xi = \sinh\xi$, $\frac{d}{d\xi}\sinh\xi = \cosh\xi$.

(c) Make a sketch of these two solution classes in the case of a = +4. 

P9.2 Prove that under the approximation of very short wavelength, the 
Poynting vector is directed along VR (r) or s. 



Solution: (partial) 

First, from Faraday's law (1.36) we have 

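$$\mathbf{B}(\mathbf{r},t) = -\frac{i}{\omega}\nabla\times\left[\mathbf{E}_0(\mathbf{r})\,e^{i[k_{\rm vac}R(\mathbf{r})-\omega t]}\right]$$

Applying the identity $\nabla\times(\mathbf{a}\psi) = \psi(\nabla\times\mathbf{a}) + \nabla\psi\times\mathbf{a}$ to this equation, we obtain:

$$\mathbf{B}(\mathbf{r},t) = -\frac{i}{\omega}\left\{e^{i[k_{\rm vac}R(\mathbf{r})-\omega t]}\left[\nabla\times\mathbf{E}_0(\mathbf{r})\right] + ik_{\rm vac}\,e^{i[k_{\rm vac}R(\mathbf{r})-\omega t]}\left[\nabla R(\mathbf{r})\times\mathbf{E}_0(\mathbf{r})\right]\right\}$$

$$= -\frac{i\lambda_{\rm vac}}{2\pi c}\,e^{i[k_{\rm vac}R(\mathbf{r})-\omega t]}\left[\nabla\times\mathbf{E}_0(\mathbf{r})\right] + \frac{1}{c}\,e^{i[k_{\rm vac}R(\mathbf{r})-\omega t]}\left[\nabla R(\mathbf{r})\times\mathbf{E}_0(\mathbf{r})\right]$$

The first term vanishes in the limit of very short wavelength, and we have:

$$\mathbf{B}(\mathbf{r},t) \to \frac{1}{c}\left[\nabla R(\mathbf{r})\right]\times\mathbf{E}_0(\mathbf{r})\,e^{i[k_{\rm vac}R(\mathbf{r})-\omega t]} \qquad (9.72)$$

Next, from Gauss's law (1.34) and the constitutive relation (2.16) we have

$$\nabla\cdot\left[\left(1+\chi(\mathbf{r})\right)\mathbf{E}_0(\mathbf{r})\,e^{i[k_{\rm vac}R(\mathbf{r})-\omega t]}\right] = 0$$

Applying the identity $\nabla\cdot(\mathbf{a}\psi) = \mathbf{a}\cdot\nabla\psi + \psi\nabla\cdot\mathbf{a}$ to this expression yields:

$$e^{i[k_{\rm vac}R(\mathbf{r})-\omega t]}\,\nabla\cdot\left[\left(1+\chi(\mathbf{r})\right)\mathbf{E}_0(\mathbf{r})\right] + ik_{\rm vac}\,e^{i[k_{\rm vac}R(\mathbf{r})-\omega t]}\left(1+\chi(\mathbf{r})\right)\left[\nabla R(\mathbf{r})\cdot\mathbf{E}_0(\mathbf{r})\right] = 0$$

Canceling the common exponential term, using $k_{\rm vac} = 2\pi/\lambda_{\rm vac}$, and some algebra then gives

$$-\frac{i\lambda_{\rm vac}}{2\pi}\,\frac{\nabla\cdot\left[\left(1+\chi(\mathbf{r})\right)\mathbf{E}_0(\mathbf{r})\right]}{1+\chi(\mathbf{r})} + \nabla R(\mathbf{r})\cdot\mathbf{E}_0(\mathbf{r}) = 0$$

In the limit of very short wavelength, this becomes

$$\nabla R(\mathbf{r})\cdot\mathbf{E}_0(\mathbf{r}) \to 0 \qquad (9.73)$$

Finally, compute the time average of the Poynting vector

$$\langle\mathbf{S}\rangle_t = \frac{1}{\mu_0}\left\langle{\rm Re}\{\mathbf{E}(\mathbf{r},t)\}\times{\rm Re}\{\mathbf{B}(\mathbf{r},t)\}\right\rangle_t = \frac{1}{4\mu_0}\left\langle\left[\mathbf{E}(\mathbf{r},t)+\mathbf{E}^*(\mathbf{r},t)\right]\times\left[\mathbf{B}(\mathbf{r},t)+\mathbf{B}^*(\mathbf{r},t)\right]\right\rangle_t$$
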
You will need to employ expressions (9.72) and (9.73), as well as the BAC-CAB rule (see P0.3). 



Exercises for 9.2 Fermat's Principle 

P9.3 Use Fermat's Principle to derive the law of reflection (3.6) for a reflective 
surface. 




HINT: Do not consider light that goes directly from A to B; require a 
single bounce. 

P9.4 Show that Fermat's Principle fails to give the correct path for an
extraordinary ray entering a uniaxial crystal whose optic axis is perpendicular
to the surface.

HINT: With the index given by (5.29), show that Fermat's principle
leads to an answer that neither agrees with the direction of the k-vector
(5.32) nor with the direction of the Poynting vector (5.40).

Figure 9.23



Exercises for 9.4 Reflection and Refraction at Curved Surfaces 

P9.5 Derive the ABCD matrix that takes a ray on a round trip through a 
simple laser cavity consisting of a flat mirror and a concave mirror of 
radius R separated by a distance L. HINT: Start at the flat mirror. Use 
the matrix in (9.28) to travel a distance L. Use the matrix in (9.38) to 
represent reflection from the curved mirror. Then use the matrix in 
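(9.28) to return to the flat mirror. The matrix for reflection from the flat
mirror is the identity matrix (i.e. R_flat → ∞).

P9.6 Derive the ABCD matrix for a thick lens made of material n_2 surrounded
by a liquid of index n_1. Let the lens have curvatures R_1 and R_2
and thickness d.

Answer:

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 + \dfrac{d}{R_1}\left(\dfrac{n_1}{n_2} - 1\right) & d\,\dfrac{n_1}{n_2} \\[2ex] -\left(\dfrac{n_2}{n_1} - 1\right)\left(\dfrac{1}{R_1} - \dfrac{1}{R_2}\right) + \dfrac{d}{R_1 R_2}\left(2 - \dfrac{n_1}{n_2} - \dfrac{n_2}{n_1}\right) & 1 + \dfrac{d}{R_2}\left(1 - \dfrac{n_1}{n_2}\right) \end{bmatrix}$$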



Exercises for 9.6 Image Formation 

P9.7 (a) Show that the ABCD matrix for a thick lens (see P9.6) reduces to that
of a thin lens (9.45) when the thickness goes to zero. Take the index
outside of the lens to be n_1 = 1.

(b) Find the ABCD matrix for a thick window (thickness d). Take the
index outside of the window to be n_1 = 1. HINT: A window is a thick
lens with infinite radii of curvature.

P9.8 An object is placed in front of a concave mirror. Find the location of
the image d_i and magnification M when d_o = R, d_o = R/2, d_o = R/4,
and d_o = -R/2 (virtual object). Make a diagram for each situation,
depicting rays traveling from a single off-axis point on the object to
a corresponding point on the image. You may want to emphasize
especially the ray that initially travels parallel to the axis and the ray
that initially travels in a direction intersecting the axis at the focal point
R/2.

P9.9 Perform an analysis similar to Example 9.8 for the virtual image formed
by the positive lens in Fig. 9.24.

P9.10 Perform an analysis similar to Example 9.8 for the virtual image formed
by the negative lens in Fig. 9.25.




Figure 9.24 Formation of a virtual 
image by a thin lens. 




Figure 9.25 Formation of a virtual 
image by a thin lens with negative 
focal length. 



Exercises for 9.7 Principal Planes for Complex Optical Systems

P9.11 A complicated lens element is represented by an ABCD matrix. An
object placed a distance d_1 before the unknown element causes an
image to appear a distance d_2 after the unknown element.

Suppose that when d_1 = ℓ, we find that d_2 = 2ℓ. Also, suppose that
when d_1 = 2ℓ, we find that d_2 = 3ℓ/2 with magnification -1/2. What is
the ABCD matrix for the unknown element?

HINT: Use the conditions for an image (9.58) and (9.59). If the index
of refraction is the same before and after, then (9.65) applies.
First find linear expressions for A, B, and C in terms of D. Then put the
results into (9.65).

Figure 9.26

Figure 9.28

Figure 9.29



P9.12 (a) Consider a lens with thickness d = 5 cm, R_1 = 5 cm, R_2 = -10 cm,
n = 1.5. Compute the ABCD matrix of the lens. HINT: See P9.6.

(b) Where are the principal planes located and what is the effective
focal length f_eff for this system?

L9.13 Deduce the positions of the principal planes and the effective focal
length of a compound lens system. Reference the positions of the
principal planes to the outside ends of the metal hardware that encloses
the lens assembly. (video)

HINT: Obtain three sets of distances to the object and image planes 
and place the data into (9.58) to create three distinct equations for the 
unknowns A, B, C, and D. Find A, B, and C in terms of D and place the 
results into (9.65) to obtain the values for A, B, C, and D. The effective 
focal length and principal planes can then be found through (9.61)- 
(9.63). 

P9.14 Use a computer program to calculate the ABCD matrix for the compound
system shown in Fig. 9.29, known as the "Tessar lens." The
details of this lens are as follows (all distances are in the same units,
and only the magnitudes of the curvatures are given; you decide the sign):
Convex-convex lens 1 (thickness 0.357, R_1 = 1.628, R_2 = 27.57, n =
1.6116) is separated by 0.189 from concave-concave lens 2 (thickness
0.081, R_1 = 3.457, R_2 = 1.582, n = 1.6053), which is separated by 0.325
from plano-concave lens 3 (thickness 0.217, R_1 = ∞, R_2 = 1.920, n =
1.5123), which is directly followed by convex-convex lens 4 (thickness
0.396, R_1 = 1.920, R_2 = 2.400, n = 1.6116).

HINT: You can reduce the number of matrices you need to multiply by 
using the "thick lens" matrix. 
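
The bookkeeping such a program needs can be sketched in a few lines of Python. This is only a sketch under stated assumptions: it takes the propagation matrix to be [[1, d], [0, 1]] and the curved-interface refraction matrix to be [[1, 0], [(n_i/n_t - 1)/R, n_i/n_t]] with R positive when the surface is convex toward the incoming ray (check these against (9.28) and the matrices of section 9.4). The function names and the sample numbers are illustrative and are not the Tessar data.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def translation(d):
    """Propagation through a distance d (assumed form of the matrix in (9.28))."""
    return [[1.0, d], [0.0, 1.0]]

def refraction(n_i, n_t, radius):
    """Refraction at a curved interface (assumed sign convention: radius > 0
    when the surface is convex toward the incoming ray; math.inf for flat)."""
    return [[1.0, 0.0], [(n_i / n_t - 1.0) / radius, n_i / n_t]]

def thick_lens(n_out, n_lens, r1, r2, d):
    """First surface, then thickness d, then second surface.
    The matrix applied first sits rightmost in the product."""
    return matmul(refraction(n_lens, n_out, r2),
                  matmul(translation(d), refraction(n_out, n_lens, r1)))

def system_matrix(elements):
    """Multiply matrices given in the order the ray encounters them."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    for element in elements:
        total = matmul(element, total)
    return total

# Hypothetical two-lens system (illustrative values, not from the text).
thin_limit = thick_lens(1.0, 1.5, 10.0, -10.0, 0.0)
pair = system_matrix([
    thick_lens(1.0, 1.5, 10.0, -10.0, 0.5),
    translation(2.0),
    thick_lens(1.0, 1.5, math.inf, 8.0, 0.3),
])
print(thin_limit)
print(pair)
```

Two quick sanity checks are useful before trusting the output: with zero thickness the lens matrix should reduce to the thin-lens form, and when the index is the same before and after the system the determinant AD - BC should equal one.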



Exercises for 9.8 Stability of Laser Cavities 



P9.15 (a) Show that the cavity depicted in Fig. 9.16c is stable if

$$0 < \left(1 - \frac{L}{R_2}\right)\left(1 - \frac{L}{R_1}\right) < 1$$

(b) The two concave mirrors have radii R_1 = 60 cm and R_2 = 100 cm.
Over what range of mirror separation L is it possible to form a stable 
laser cavity? 

HINT: There are two different stable ranges with an unstable range 
between them. 



P9.16 Find the stable ranges for L_1 = L_2 = L for the laser cavity depicted in
Fig. 9.16d with focal length f = 50 cm.

L9.17 Experimentally determine the stability range of a HeNe laser with adjustable
end mirrors. Check that this agrees reasonably well with theory.
Can you think of reasons for any discrepancy? (video)



Figure 9.30



Diffraction



In the 1600s, Christiaan Huygens developed a wave description for light.
Unfortunately, his ideas were largely overlooked at the time because Sir Isaac Newton
promoted a competing theory. Newton proposed that light should be thought
of as many tiny bullets, or corpuscles, as he called them. Newton's ideas
prevailed for more than a century, perhaps because he was right on so many other
things, until 1807 when Thomas Young performed his famous two-slit experiment,
conclusively demonstrating the wave nature of light. Even then, Young's
conclusions were accepted only gradually by others, a notable exception being a young
Frenchman named Augustin Fresnel. The two formed a close friendship through
correspondence, and it was Fresnel who followed up on Young's conclusions and
dedicated his life to a study of light.

Fresnel's skill as a mathematician allowed him to transform physical intuition 
into powerful and concise ideas. Perhaps Fresnel's greatest accomplishment was 
the adaptation of Huygens' principle of wavelet superposition into a mathematical 
formula. Ironically, he used Newton's calculus to achieve this. Huygens' principle 
asserts that a wave front can be thought of as many wavelets, which propagate and 
interfere to form new wave fronts. This is illustrated in Fig. 10.1. The phenomenon 
of diffraction is then understood as the spilling of wavelets around obstructions 
in the path of light. 

After formulating Huygens' principle as a diffraction integral, Fresnel made
an approximation to his own formula, called the Fresnel approximation, for the
sake of making the integration easier to perform. As far as approximations go,
the Fresnel approximation is surprisingly accurate in describing the light field
in the region downstream from an aperture. The diffraction pattern can evolve
in complicated ways as the distance from an aperture increases. At distances far
downstream from an aperture, the diffraction pattern acquires a final form that
no longer evolves, other than to grow in proportion to distance. This far-field
limit is often of interest, and it turns out that the Fresnel diffraction formula can
be simplified further in this case. The far-field limit of the Fresnel diffraction
formula is called the Fraunhofer approximation.

From the modern perspective, Fresnel's diffraction formula needs justification
starting from Maxwell's equations. The diffraction formula is based on scalar
diffraction theory, which ignores polarization effects. In some situations,
ignoring polarization is benign, but in other situations ignoring polarization effects
produces significant errors. These issues as well as the approximations leading to
scalar diffraction theory are discussed in section 10.2.

Christiaan Huygens (1629-1695, Dutch) was born in The Hague, Netherlands.
His father was friends with the mathematician René Descartes, which
probably influenced his upbringing. Huygens studied law and mathematics at
the University of Leiden, which preceded a very productive career as a scientist
and mathematician. During mid career, Huygens held a position in the French
Academy of Sciences in Paris for 15 years, but spent the majority of his life
in The Hague. Huygens was the first to advocate the wave theory of light.
He was able to explain birefringence in terms of his wave theory together with
a refractive index that varied with direction. Huygens constructed a telescope
with which he discovered Saturn's moon Titan. He also made the first detailed
observations of the Orion nebula. Huygens made significant advancements in
clock-making technology and wrote a book on probability theory. Huygens
was one of the earliest science-fiction writers and speculated that life exists on
other planets in his book Cosmotheoros. (Wikipedia)



10.1 Huygens' Principle as Formulated by Fresnel 




Figure 10.1 Wave fronts depicted as a series of Huygens' wavelets.

Figure 10.2



In this section we discuss the calculus of summing up the contributions from the
many wavelets originating in an aperture illuminated by a light field. Each point
in the aperture is thought of as a source of a spherical wavelet.¹ In our modern
notation, such a spherical wave can be written as proportional to $e^{ikR}/R$, where R
is the distance from the source. As a spherical wave propagates, its strength falls
off in proportion to the distance traveled and the phase is related to the distance
propagated, similar to the phase of a plane wave.

You should be aware that a spherical wave of the form $e^{ikR}/R$ (even if some
sort of polarization is attached) is a poor solution to Maxwell's equations (see
P10.2). It utterly fails near R = 0. However, if R is much larger than a wavelength,
this spherical wave starts to approximate actual solutions to Maxwell's equations.
It is within this regime that the diffraction formula derived here is successful. It
should be noted that by choosing k, we consider only a single wavelength of light
(i.e. one frequency).

Consider an aperture or opening in an opaque screen in the plane z = 0. Let
the aperture be illuminated with a light field distribution E(x', y', z = 0) within
the aperture. Then for a point (x, y, z) lying somewhere after the aperture (z > 0)
the net field is given by adding together wavelets emitted from each point in the
aperture.

Each spherical wavelet takes on the strength and phase of the field at the
point where it originates. Mathematically, this summation takes the form

$$E(x,y,z) = -\frac{i}{\lambda}\iint\limits_{\rm aperture} E(x',y',0)\,\frac{e^{ikR}}{R}\,dx'\,dy' \qquad (10.1)$$

where

$$R = \sqrt{(x-x')^2 + (y-y')^2 + z^2} \qquad (10.2)$$

is the radius of each wavelet as it individually intersects the point (x, y, z). The
factor $-i/\lambda$ in front of the integral in (10.1) ensures the right phase and field
strength (not to mention units). Justification for this factor is given in section
10.3 and in appendix 10.A. To summarize, (10.1) tells us how to compute the field
downstream given knowledge of the field in an aperture. The field at each point
(x', y') in the aperture, which may vary with strength and phase, is treated as the
source for a spherical wave. The integral in (10.1) sums the contributions for all
of these wavelets.

¹ For simplicity, we use the term 'spherical wave' in this book to refer to waves of the type
imagined by Huygens (i.e. of the form $e^{ikR}/R$). There is a different family of waves based on
spherical harmonics that are also sometimes referred to as spherical waves. These waves have
angular as well as radial dependence, and they are solutions to Maxwell's equations. See J. D.
Jackson, Classical Electrodynamics, 3rd ed., pp. 429-432 (New York: John Wiley, 1999).



Example 10.1

Find the on-axis² (i.e. x, y = 0) intensity following a circular aperture of diameter ℓ
illuminated by a uniform plane wave.

Solution: The diffraction integral (10.1) takes the form

$$E(0,0,z) = -\frac{i}{\lambda}\iint\limits_{\rm aperture} E(x',y',0)\,\frac{e^{ik\sqrt{x'^2+y'^2+z^2}}}{\sqrt{x'^2+y'^2+z^2}}\,dx'\,dy'$$

The circular hole encourages a change to cylindrical coordinates: $x' = \rho'\cos\phi'$ and
$y' = \rho'\sin\phi'$; $dx'\,dy' \to \rho'\,d\rho'\,d\phi'$. In this case, the limits of integration define
the geometry of the aperture, and the integration is accomplished as follows:

$$E(0,0,z) = -\frac{iE_0}{\lambda}\int_0^{2\pi}d\phi'\int_0^{\ell/2}\frac{e^{ik\sqrt{\rho'^2+z^2}}}{\sqrt{\rho'^2+z^2}}\,\rho'\,d\rho' = -\frac{2\pi i E_0}{\lambda}\left.\frac{e^{ik\sqrt{\rho'^2+z^2}}}{ik}\right|_0^{\ell/2} = -E_0\left[e^{ik\sqrt{(\ell/2)^2+z^2}} - e^{ikz}\right]$$

The on-axis intensity becomes

$$I(0,0,z) \propto E(0,0,z)\,E^*(0,0,z) = |E_0|^2\left[e^{ik\sqrt{(\ell/2)^2+z^2}} - e^{ikz}\right]\left[e^{-ik\sqrt{(\ell/2)^2+z^2}} - e^{-ikz}\right]$$

$$= 2|E_0|^2\left[1 - \cos\left(k\sqrt{(\ell/2)^2+z^2} - kz\right)\right] \qquad (10.3)$$

See problem P10.6 for a graph of this function.
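
As a numerical cross-check of this example, the short Python sketch below evaluates the on-axis version of (10.1) with a midpoint rule in cylindrical coordinates and compares it with the closed-form field and with (10.3). The function names, the step count, and the sample wavelength and aperture size are illustrative assumptions, not values from the text.

```python
import cmath
import math

def on_axis_field_closed_form(z, diameter, wavelength, e0=1.0):
    """Closed-form on-axis field behind a circular aperture (Example 10.1)."""
    k = 2 * math.pi / wavelength
    edge = math.hypot(diameter / 2, z)  # distance from the aperture rim to (0, 0, z)
    return -e0 * (cmath.exp(1j * k * edge) - cmath.exp(1j * k * z))

def on_axis_field_numeric(z, diameter, wavelength, e0=1.0, steps=20000):
    """Midpoint-rule evaluation of (10.1) on axis; the phi' integral gives 2*pi."""
    k = 2 * math.pi / wavelength
    d_rho = (diameter / 2) / steps
    total = 0j
    for n in range(steps):
        rho = (n + 0.5) * d_rho
        big_r = math.hypot(rho, z)
        total += cmath.exp(1j * k * big_r) / big_r * rho * d_rho
    return (-1j / wavelength) * e0 * 2 * math.pi * total

def on_axis_intensity(z, diameter, wavelength, e0=1.0):
    """Relative on-axis intensity from (10.3)."""
    k = 2 * math.pi / wavelength
    phase = k * math.hypot(diameter / 2, z) - k * z
    return 2 * abs(e0) ** 2 * (1 - math.cos(phase))

# Illustrative case: 500 nm light, 20 micron aperture, 100 microns downstream.
z, diameter, wavelength = 100e-6, 20e-6, 500e-9
closed = on_axis_field_closed_form(z, diameter, wavelength)
numeric = on_axis_field_numeric(z, diameter, wavelength)
print(abs(closed - numeric), on_axis_intensity(z, diameter, wavelength))
```

According to (10.3), the on-axis intensity swings between zero and four times the incident intensity as z varies, which is what a scan of `on_axis_intensity` over z shows.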



When an aperture has a complicated shape, it may be convenient to break up 
the diffraction integral (10.1) into several pieces. You are probably already used 
to doing this sort of piecewise approach to integration in other settings. It seems 
hardly worth giving a name to this technique, but it is called Babinet's principle; 
perhaps in Babinet's day people were not as comfortable with calculus. 

As an example of how to use Babinet's principle, suppose that we have an
aperture that consists of a circular obstruction within a square opening as
depicted in Fig. 10.4. Thus, the light transmits through the region between the circle
and the square. One can evaluate the overall diffraction pattern by first evaluating
the diffraction integral for the entire square (ignoring the circular block) and then
subtracting the diffraction integral for a circular opening having the shape of the
block. This removes the unwanted part of the previous integration and yields the
overall result. It is important to add and subtract the integrals (i.e. fields), not
their squares (i.e. intensity).

Figure 10.3 Circular aperture illuminated by a plane wave.

Figure 10.4 Aperture comprised of the region between a circle and a square.

Figure 10.5 A block in a plane wave giving rise to diffraction in the geometric shadow.

² An analytical solution is not possible off axis.

As trivial as Babinet's principle may seem to you, it may not be obvious at 
first that Babinet's principle also applies to an infinitely wide plane wave that 
is interrupted by finite obstructions. In this case, one computes the diffraction 
of the blocked portions of the field as though these portions were openings in a 
mask. This result is then subtracted from the plane wave (no integration needed 
for the plane), as depicted in Fig. 10.5. 

When Fresnel first presented his diffraction formula to the French Academy 
of Sciences, a certain judge of scientific papers named Simeon Poisson noticed 
that Fresnel's formula predicted that there should be light in the center of the 
geometric shadow behind a circular obstruction. This seemed so absurd to 
Poisson that he initially disbelieved the theory, until the spot was shortly thereafter 
experimentally confirmed, much to Poisson's chagrin. Needless to say, Fresnel's 
paper was then awarded first prize, and this spot appearing behind circular blocks 
has since been known as Poisson's spot. 



Example 10.2

Find the on-axis (i.e. x, y = 0) intensity behind a circular block of diameter ℓ placed
in a uniform plane wave.

Solution: From Example 10.1, the on-axis field behind a circular aperture is
$E_0\left[e^{ikz} - e^{ik\sqrt{(\ell/2)^2+z^2}}\right]$. Babinet's principle says to subtract this result from a plane
wave to obtain the field behind the circular block. The situation is depicted in
Fig. 10.5 (side view). The on-axis field is then

$$E(0,0,z) = E_0 e^{ikz} - E_0\left[e^{ikz} - e^{ik\sqrt{(\ell/2)^2+z^2}}\right] = E_0\,e^{ik\sqrt{(\ell/2)^2+z^2}}$$

The on-axis intensity becomes

$$I(0,0,z) \propto E(0,0,z)\,E^*(0,0,z) = |E_0|^2\,e^{ik\sqrt{(\ell/2)^2+z^2}}\,e^{-ik\sqrt{(\ell/2)^2+z^2}} = |E_0|^2$$

This result says that, in the exact center of the shadow behind a circular obstruction,
the intensity is the same as the illuminating plane wave for all distances z. A spot of
light in the center forms right away; no wonder Poisson was astonished!
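
The subtraction of fields (rather than intensities) in this example can be sketched in Python as follows. The function names and the sample numbers are illustrative assumptions; the aperture field is the closed-form result quoted from Example 10.1.

```python
import cmath
import math

def field_behind_aperture(z, diameter, wavelength, e0=1.0):
    """On-axis field behind a circular aperture, as quoted from Example 10.1."""
    k = 2 * math.pi / wavelength
    return e0 * (cmath.exp(1j * k * z)
                 - cmath.exp(1j * k * math.hypot(diameter / 2, z)))

def field_behind_block(z, diameter, wavelength, e0=1.0):
    """Babinet's principle: unobstructed plane wave minus the aperture field."""
    k = 2 * math.pi / wavelength
    plane_wave = e0 * cmath.exp(1j * k * z)
    return plane_wave - field_behind_aperture(z, diameter, wavelength, e0)

# Illustrative case: 1 mm block in 633 nm light at several distances (meters).
distances = [0.01, 0.1, 1.0, 10.0]
spot = [abs(field_behind_block(z, 1e-3, 633e-9)) ** 2 for z in distances]

# Subtracting intensities instead of fields does not reproduce the spot.
wrong = [1.0 - abs(field_behind_aperture(z, 1e-3, 633e-9)) ** 2 for z in distances]
print(spot)
print(wrong)
```

The list `spot` stays at the incident intensity for every distance, while `wrong` fluctuates with z, which illustrates why the fields must be combined before squaring.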



10.2 Scalar Diffraction Theory 

In this section we provide the background motivation for Huygens' principle and
Fresnel's formulation of it. Consider a light field with a single frequency ω. The
light field can be represented by $\mathbf{E}(\mathbf{r})\,e^{-i\omega t}$, and the time derivative in the wave
equation (2.13) can be easily performed. It reduces to

$$\nabla^2\mathbf{E}(\mathbf{r}) + k^2\mathbf{E}(\mathbf{r}) = 0 \qquad (10.4)$$

where $k = n\omega/c$ is the magnitude of the usual wave vector (see also (9.2)).
Equation (10.4) is called the Helmholtz equation. Again, it is merely the wave equation
written for the case of a single frequency where the trivial time dependence has
been removed. To obtain the full wave solution, just append the factor $e^{-i\omega t}$ to
the solution of the Helmholtz equation $\mathbf{E}(\mathbf{r})$.

At this point we take an egregious step: We ignore the vectorial nature of $\mathbf{E}(\mathbf{r})$
and write (10.4) using only the magnitude $E(\mathbf{r})$. When using scalar diffraction
theory, we must keep in mind that it is based on this serious step. Under the
scalar approximation, the vector Helmholtz equation (10.4) becomes the scalar
Helmholtz equation:

$$\nabla^2 E(\mathbf{r}) + k^2 E(\mathbf{r}) = 0 \qquad (10.5)$$

This equation of course is consistent with (10.4) in the case of a plane wave.
However, we are interested in spherical waves of the form $E(r) = E_0 r_0 e^{ikr}/r$. It
turns out that such spherical waves are exact solutions to the scalar Helmholtz
equation (10.5). The proof is left as an exercise (see P10.3). Nevertheless, spherical
waves of this form only approximately satisfy the vector Helmholtz equation (10.4).
We can get away with this sleight of hand if the radius r is large compared to a
wavelength (i.e., kr ≫ 1) and if we restrict r to a narrow range perpendicular to
the polarization.
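
A quick numerical sanity check (not a proof) of the claim about the scalar Helmholtz equation can be sketched in Python. The sketch applies a seven-point finite-difference Laplacian to $E_0 r_0 e^{ikr}/r$ at a point away from the origin; the step size h, the test point, and the function names are illustrative assumptions.

```python
import cmath
import math

def spherical_wave(x, y, z, k, e0=1.0, r0=1.0):
    """Scalar spherical wave E0 * r0 * exp(i k r) / r."""
    r = math.sqrt(x * x + y * y + z * z)
    return e0 * r0 * cmath.exp(1j * k * r) / r

def helmholtz_residual(field, point, k, h=1e-3):
    """Finite-difference estimate of (laplacian + k^2) acting on field at point."""
    x, y, z = point
    center = field(x, y, z, k)
    laplacian = (field(x + h, y, z, k) + field(x - h, y, z, k)
                 + field(x, y + h, z, k) + field(x, y - h, z, k)
                 + field(x, y, z + h, k) + field(x, y, z - h, k)
                 - 6 * center) / h ** 2
    return laplacian + k ** 2 * center

k = 2 * math.pi           # wavelength of one length unit
point = (1.0, 2.0, 2.0)   # three wavelengths from the origin
residual = helmholtz_residual(spherical_wave, point, k)
scale = k ** 2 * abs(spherical_wave(*point, k))
print(abs(residual) / scale)
```

The printed ratio is limited only by the finite-difference truncation error, so it shrinks as h is reduced (until round-off takes over).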

Significance of the Scalar Wave Approximation

The solution of the scalar Helmholtz equation is not completely unassociated with
the solution to the vector Helmholtz equation. In fact, if $E_{\rm scalar}(\mathbf{r})$ obeys the scalar
Helmholtz equation (10.5), then

$$\mathbf{E}(\mathbf{r}) = \mathbf{r}\times\nabla E_{\rm scalar}(\mathbf{r}) \qquad (10.6)$$

obeys the vector Helmholtz equation (10.4).

Consider a spherical wave, which is a solution to the scalar Helmholtz equation:

$$E_{\rm scalar}(\mathbf{r}) = E_0 r_0\,e^{ikr}/r \qquad (10.7)$$

Remarkably, when this expression is placed into (10.6) the result is zero. Although
zero is in fact a solution to the vector Helmholtz equation, it is not very interesting.
A more interesting solution to the scalar Helmholtz equation is

$$E_{\rm scalar}(\mathbf{r}) = r_0 E_0\left(1 + \frac{i}{kr}\right)\frac{e^{ikr}}{r}\cos\theta \qquad (10.8)$$

which is one of an infinite number of unique 'spherical' solutions that exist. Notice
that in the limit of large r, this expression looks similar to (10.7), aside from the
factor cos θ. The vector form of this field according to (10.6) is

$$\mathbf{E}(\mathbf{r}) = -\hat{\boldsymbol{\phi}}\,r_0 E_0\left(1 + \frac{i}{kr}\right)\frac{e^{ikr}}{r}\sin\theta \qquad (10.9)$$

This field looks approximately like the scalar spherical wave solution (10.7) in the
limit of large r if the angle is chosen to lie near θ = π/2 (spherical coordinates).
Since our use of the scalar Helmholtz equation is in connection with this spherical
wave under these conditions, the results are close to those obtained from the
vector Helmholtz equation.

François Jean Dominique Arago (1786-1853, French) was born in Catalan France,
where his father was the Treasurer of the Mint. As a teenager, Arago was sent to
a municipal college in Perpignan where he developed a deep interest in
mathematics. In 1803, he entered the Ecole Polytechnique in Paris, where he
purportedly was disappointed that he was not presented with new knowledge at a
higher rate. He associated with famous French mathematicians Siméon Poisson and
Pierre-Simon Laplace. He later worked with Jean-Baptiste Biot to measure the
meridian arc to determine the exact length of the meter. This work took him to
the Balearic Islands, Spain, where he was imprisoned as a spy, being suspected
because of lighting fires atop a mountain as part of his surveying efforts. After
a heroic prison escape and a subsequent string of misfortunes, he eventually made
it back to France where he took a strong interest in optics and the wave theory
of light. Arago and Fresnel established a fruitful collaboration that extended for
many years. It was Arago who demonstrated Poisson's spot (sometimes called
Arago's spot). Arago also invented the first polarizing filter. In later life, he
served a brief stint as the French prime minister. (Wikipedia)

Figure 10.6



Fresnel developed his diffraction formula (10.1) a half century before Maxwell
assembled the equations of electromagnetic theory. In 1887, Gustav Kirchhoff
demonstrated that Fresnel's diffraction formula satisfies the scalar Helmholtz
equation. In doing this he clearly showed the approximations implicit in the
theory, and made a slight revision to the formula:

$$E(x,y,z) = -\frac{i}{\lambda}\iint\limits_{\rm aperture} E(x',y',0)\,\frac{e^{ikR}}{R}\left[\frac{1 + \cos(\mathbf{R},\hat{\mathbf{z}})}{2}\right]dx'\,dy' \qquad (10.10)$$

The factor in square brackets, Kirchhoff's revision, is known as the obliquity factor.
Here, $\cos(\mathbf{R},\hat{\mathbf{z}})$ indicates the cosine of the angle between R and z. Notice that this
factor is approximately equal to one when the point (x, y, z) is chosen to be in
the forward direction; we usually study diffraction under this circumstance. On
the other hand, the obliquity factor equals zero for fields traveling in the reverse
direction (i.e. in the -z direction). This fixes a problem with Fresnel's version of
the formula (10.1) based on Huygens' wavelets, which suggested that light could
as easily diffract in the reverse direction as in the forward direction.

In honor of Kirchhoff's work, (10.10) is referred to as the Fresnel-Kirchhoff
diffraction formula. The details of Kirchhoff's more rigorous derivation, including
how the factor $-i/\lambda$ naturally arises, are given in Appendix 10.A. Since the
Fresnel-Kirchhoff formula can be understood as a superposition of spherical waves, it is
not surprising that it satisfies the scalar Helmholtz equation (10.5).



10.3 Fresnel Approximation 

Although the Fresnel-Kirchhoff integral looks innocent enough, it is actually quite
difficult to evaluate analytically. It is problematic even if the field E(x', y', z = 0) is
constant across the aperture and if the obliquity factor $(1 + \cos(\mathbf{R},\hat{\mathbf{z}}))/2$ is
approximated as one (i.e. forward direction).

Fresnel introduced an approximation³ to his diffraction formula that makes
the integration somewhat easier to perform. The approximation is analogous
to the paraxial approximation made for rays in chapter 9. Thus, the Fresnel
approximation requires the avoidance of large angles with respect to the z-axis.

Besides letting the obliquity factor be one, Fresnel approximated R by the
distance z in the denominator of (10.10). Then the denominator can be brought
out in front of the integral since it no longer depends on x' and y'. This is valid to
the extent that we restrict ourselves to small angles:

$$R \cong z \qquad \text{(denominator only; Fresnel approximation)} \qquad (10.11)$$

³ J. W. Goodman, Introduction to Fourier Optics, Sect. 4-1 (New York: McGraw-Hill, 1968).

The above approximation is wholly inappropriate in the exponent of (10.10) since
small changes in R can result in dramatic variations in the periodic function $e^{ikR}$.
To approximate R in the exponent, we must proceed with caution. To this end
we expand (10.2) under the assumption $z^2 \gg (x-x')^2 + (y-y')^2$. Again, this is
consistent with the idea of restricting ourselves to relatively small angles. The
expansion of (10.2) is written as

$$R = z\sqrt{1 + \frac{(x-x')^2 + (y-y')^2}{z^2}} \cong z\left[1 + \frac{(x-x')^2 + (y-y')^2}{2z^2} + \cdots\right] \qquad \text{(exponent; Fresnel approximation)} \qquad (10.12)$$

Substitution of (10.11) and (10.12) into the Fresnel diffraction formula (10.1)
yields

$$E(x,y,z) \cong -\frac{i\,e^{ikz}\,e^{i\frac{k}{2z}(x^2+y^2)}}{\lambda z}\iint\limits_{\rm aperture} E(x',y',0)\,e^{i\frac{k}{2z}(x'^2+y'^2)}\,e^{-i\frac{k}{z}(xx'+yy')}\,dx'\,dy' \qquad \text{(Fresnel approximation)} \qquad (10.13)$$

This is Fresnel's approximation to his diffraction integral formula. It may look a bit
messier than before, but in terms of being able to make progress on integration
we are better off than previously. Notice that the integral can be interpreted as a
two-dimensional Fourier transform on $E(x',y',0)\,e^{i\frac{k}{2z}(x'^2+y'^2)}$.



Example 10.3

Compute the Fresnel diffraction field following a rectangular aperture (dimensions
Δx by Δy) illuminated by a uniform plane wave.

Solution: According to (10.13), the field downstream is

$$E(x,y,z) = -iE_0\,\frac{e^{ikz}}{\lambda z}\,e^{i\frac{k}{2z}(x^2+y^2)}\int_{-\Delta x/2}^{\Delta x/2} dx'\,e^{i\frac{k}{2z}x'^2}\,e^{-i\frac{k}{z}xx'}\int_{-\Delta y/2}^{\Delta y/2} dy'\,e^{i\frac{k}{2z}y'^2}\,e^{-i\frac{k}{z}yy'}$$


Unfortunately, the integration in the preceding example must be performed 
numerically. This is often the case for diffraction integrals in the Fresnel approx- 
imation. Figure 10.7 shows the result of such an integration for a rectangular 
aperture with a height twice its width. 
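
One way such a numerical integration might be organized is sketched below in Python. Because the integrand factors, each dimension is handled by the same one-dimensional midpoint rule. The function names, the step count, and the sample parameters are illustrative assumptions; this is not the code used to produce Fig. 10.7.

```python
import cmath
import math

def fresnel_1d(x, width, z, k, steps=4000):
    """Midpoint-rule value of the one-dimensional factor
    integral over x' from -width/2 to width/2 of
    exp(i k x'^2 / (2 z)) * exp(-i k x x' / z)."""
    dx = width / steps
    total = 0j
    for n in range(steps):
        xp = -width / 2 + (n + 0.5) * dx
        total += cmath.exp(1j * k * (xp * xp / 2 - x * xp) / z)
    return total * dx

def fresnel_rect_field(x, y, z, width_x, width_y, wavelength, e0=1.0, steps=4000):
    """Fresnel-approximation field (10.13) behind a uniformly lit rectangle."""
    k = 2 * math.pi / wavelength
    prefactor = (-1j * e0 * cmath.exp(1j * k * z) / (wavelength * z)
                 * cmath.exp(1j * k * (x * x + y * y) / (2 * z)))
    return (prefactor * fresnel_1d(x, width_x, z, k, steps)
            * fresnel_1d(y, width_y, z, k, steps))

# Illustrative checks in units of the wavelength (wavelength = 1):
# a small aperture viewed from far away, and a very wide aperture up close.
far = fresnel_rect_field(0.0, 0.0, 1000.0, 1.0, 2.0, 1.0)
wide = fresnel_rect_field(0.0, 0.0, 1.0, 40.0, 40.0, 1.0, steps=40000)
print(abs(far), abs(wide))
```

The two sample points probe opposite limits: far from a small aperture the on-axis magnitude approaches ΔxΔy/(λz), and close behind a very wide aperture it approaches the incident plane-wave amplitude. The step count must be large enough that the quadratic phase changes only slightly from one sample to the next.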

Paraxial Wave Equation

If we assume that the light coming through the aperture is highly directional, such
that it propagates mainly in the z-direction, we are motivated to write the field
as $E(x,y,z) = \tilde{E}(x,y,z)\,e^{ikz}$. Upon substitution of this into the scalar Helmholtz
equation (10.5), we arrive at

$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + 2ik\frac{\partial}{\partial z} + \frac{\partial^2}{\partial z^2}\right)\tilde{E}(x,y,z) = 0 \qquad (10.14)$$

At this point we make the paraxial wave approximation,⁴ which is $\left|2k\,\partial\tilde{E}/\partial z\right| \gg \left|\partial^2\tilde{E}/\partial z^2\right|$.
That is, we assume that the amplitude of the field varies slowly in the z-direction
such that the wave looks much like a plane wave. We permit the amplitude to
change as the wave propagates in the z-direction as long as it does so on a scale
much longer than a wavelength. This leads to the paraxial wave equation:

$$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + 2ik\frac{\partial}{\partial z}\right)\tilde{E}(x,y,z) \cong 0 \qquad \text{(paraxial wave equation)} \qquad (10.15)$$

It turns out that the Fresnel approximation (10.13) is an exact solution to the
paraxial wave equation. As demonstrated in problem P10.5, (10.15) is satisfied by

$$\tilde{E}(x,y,z) = -\frac{i}{\lambda z}\iint\limits_{\rm aperture} E(x',y',0)\,e^{i\frac{k}{2z}\left[(x-x')^2 + (y-y')^2\right]}\,dx'\,dy' \qquad (10.16)$$

When the factor $e^{ikz}$ is appended, this field is identical to (10.13).

Figure 10.7 Field amplitude following a rectangular aperture computed in the
Fresnel approximation. (Panels: z = 100/k, z = 500/k, z = 2500/k; each panel
carries the axis label 640/k.)



Joseph von Fraunhofer (1787-1826, 
German) was born in Straubing, Bavaria. 
He was orphaned at age 11, whereupon 
he was apprenticed to a glassmaker. 
The workshop collapsed, trapping him 
in the rubble. The Prince of Bavaria 
directed the rescue efforts and thereafter 
took an interest in Fraunhofer's educa- 
tion. The prince required the glassmaker 
to allow young Joseph time to study, 
and he naturally took an interest in 
optics. Fraunhofer later worked at the 
Optical Institute at Benediktbeuern, 
where he learned techniques for making 
the finest optical glass in his day. Fraun- 
hofer developed numerous glass recipes 
and was expert at creating optical de- 
vices. Fraunhofer was the inventor of 
the spectroscope, making it possible to 
do quantitative spectroscopy. Using his 
spectroscope, Fraunhofer was the first 
to observe and document hundreds of 
absorption lines in the sun's spectrum. 
He also noticed that these varied for 
different stars, thus establishing the field 
of stellar spectroscopy. He was also the 
inventor of the diffraction grating. In 
1822, he was granted an honorary doc- 
torate from the University of Erlangen. 
Fraunhofer passed away at age 39, per- 
haps due to heavy-metal poisoning from 
glass blowing. (Wikipedia) 



10.4 Fraunhofer Approximation

An additional approximation to the diffraction integral was made famous by 
Joseph von Fraunhofer. The Fraunhofer approximation is the limiting case of 
the Fresnel approximation when the field is observed at a distance far after the 
aperture (called the far field). A diffraction pattern continuously evolves along the 
z-direction, as described by the Fresnel approximation. Eventually it evolves into 
a final diffraction pattern that maintains itself as it continues to propagate (although
it increases in size in proportion to distance). It is this far-away diffraction
pattern that is obtained from the Fraunhofer approximation. Since the Fresnel 
approximation requires the angles to be small (i.e. the paraxial approximation), 
so does the Fraunhofer approximation. 

In many textbooks, the Fraunhofer approximation is presented first because 
the formula is easier to use. However, since it is a special case of the Fresnel 
approximation, it logically should be discussed afterwards as we are doing here. 
To obtain the diffraction pattern very far after the aperture, we make the following
approximation:$^5$

$$e^{i\frac{k}{2z}\left(x'^2+y'^2\right)} \cong 1 \qquad \text{(far field)} \tag{10.17}$$



The validity of this approximation depends on a comparison of the size of the
aperture to the distance z where the diffraction pattern is observed. We need

$$z \gg \frac{k}{2}\left(\text{aperture radius}\right)^2 \qquad \text{(condition for far field)} \tag{10.18}$$
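To get a feel for the numbers in (10.18), the following short sketch evaluates the right-hand side, $(k/2)(\text{aperture radius})^2 = \pi(\text{aperture radius})^2/\lambda$, for a few aperture sizes. The function names and the sample radii and wavelength are illustrative assumptions rather than values taken from the text.

```python
import math


def far_field_scale(aperture_radius, wavelength):
    """Right-hand side of (10.18): (k/2) * (aperture radius)^2, in the units of the inputs."""
    k = 2.0 * math.pi / wavelength
    return 0.5 * k * aperture_radius**2


def edge_phase(aperture_radius, wavelength, z):
    """Largest phase (radians) of the factor in (10.17) for points inside the aperture radius."""
    return far_field_scale(aperture_radius, wavelength) / z


if __name__ == "__main__":
    wavelength = 500e-9  # assumed visible wavelength, in meters
    for radius in (50e-6, 1e-3, 1e-2):
        scale = far_field_scale(radius, wavelength)
        print(f"radius = {radius:.1e} m  ->  need z >> {scale:.3g} m")
```

The far-field approximation (10.17) is good when `edge_phase` is much smaller than one radian, which is just another way of stating (10.18).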



4 P. W. Milonni and J. H. Eberly, Lasers, Sect. 14.4 (New York: Wiley, 1988).

5 J. W. Goodman, Introduction to Fourier Optics, p. 61 (New York: McGraw-Hill, 1968).
By removing the factor (10.17) from (10.13), we obtain the Fraunhofer diffraction
formula:

$$E(x,y,z) \cong -\frac{i e^{ikz} e^{i\frac{k}{2z}\left(x^2+y^2\right)}}{\lambda z}\iint\limits_{\text{aperture}} E(x',y',0)\,e^{-i\frac{k}{z}\left(xx'+yy'\right)}\,dx'\,dy' \qquad \text{(Fraunhofer approximation)} \tag{10.19}$$



Obviously, the removal of $e^{i\frac{k}{2z}\left(x'^2+y'^2\right)}$ from the integrand improves our chances of
being able to perform the integration. Notice that the integral can now be interpreted
as a two-dimensional Fourier transform on the aperture field $E(x',y',0)$.

Once we are in the Fraunhofer regime, a change in z is not very interesting 
since it appears in the combination x/z or y/z inside the integral. At a larger 
distance z, the same diffraction pattern is obtained with a proportionately larger 
value of x or y. The Fraunhofer diffraction pattern thus preserves itself indefinitely 
as the field propagates. It grows in size as the distance z increases, but the angular 
size defined by x/z or y/z remains the same.

Example 10.4 

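Compute the Fraunhofer diffraction pattern following a rectangular aperture
(dimensions $\Delta x$ by $\Delta y$) illuminated by a uniform plane wave.

Solution: According to (10.19), the field downstream is

$$E(x,y,z) = -iE_0\frac{e^{ikz}e^{i\frac{k}{2z}\left(x^2+y^2\right)}}{\lambda z}\int\limits_{-\Delta x/2}^{\Delta x/2}dx'\,e^{-i\frac{kx}{z}x'}\int\limits_{-\Delta y/2}^{\Delta y/2}dy'\,e^{-i\frac{ky}{z}y'}$$

It is left as an exercise (see P10.8) to perform the integration and compute the
intensity. The result turns out to be

$$I(x,y,z) = I_0\frac{\Delta x^2\,\Delta y^2}{\lambda^2 z^2}\,\mathrm{sinc}^2\!\left(\frac{\pi\Delta x}{\lambda z}x\right)\mathrm{sinc}^2\!\left(\frac{\pi\Delta y}{\lambda z}y\right) \tag{10.20}$$

where $\mathrm{sinc}\,\xi \equiv \sin\xi/\xi$. Note that $\lim_{\xi\to 0}\mathrm{sinc}\,\xi = 1$.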



Figure 10.8 Fraunhofer diffraction pattern (field amplitude) generated by a
uniformly illuminated rectangular aperture with a height twice the width.
(Axis labels: x/z = λ/Δx and y/z = λ/Δy.)
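As a numerical cross-check of the one-dimensional integrals that appear in Example 10.4, the sketch below evaluates $\int_{-\Delta x/2}^{\Delta x/2} e^{-i(kx/z)x'}\,dx'$ by a simple midpoint rule and compares it with the closed form $\Delta x\,\mathrm{sinc}(\pi\Delta x\,x/\lambda z)$ implied by (10.20). The slit width, wavelength, and screen distance used here are illustrative assumptions, chosen so that the far-field condition (10.18) is comfortably met.

```python
import cmath
import math


def slit_integral(x, z, width, wavelength, steps=2000):
    """Midpoint-rule estimate of the integral of exp(-i k x x'/z) over the slit."""
    k = 2.0 * math.pi / wavelength
    h = width / steps
    total = 0j
    for m in range(steps):
        xp = -0.5 * width + (m + 0.5) * h  # midpoint of the m-th sub-interval
        total += cmath.exp(-1j * k * x * xp / z)
    return total * h


def sinc(xi):
    """sinc(xi) = sin(xi)/xi with the limiting value 1 at xi = 0."""
    return 1.0 if xi == 0.0 else math.sin(xi) / xi


def slit_closed_form(x, z, width, wavelength):
    """Closed-form value of the same integral: width * sinc(pi * width * x / (wavelength * z))."""
    return width * sinc(math.pi * width * x / (wavelength * z))


if __name__ == "__main__":
    width, wavelength, z = 1e-4, 500e-9, 1.0  # assumed: 0.1 mm slit, 500 nm light, 1 m screen
    for x in (0.0, 2e-3, 5e-3):
        numeric = slit_integral(x, z, width, wavelength)
        exact = slit_closed_form(x, z, width, wavelength)
        print(f"x = {x:.1e} m  numeric = {numeric.real:.6e}  closed form = {exact:.6e}")
```

With these numbers the first zero of the pattern falls at $x = \lambda z/\Delta x = 5$ mm, in line with the axis label $x/z = \lambda/\Delta x$ in Fig. 10.8.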



10.5 Diffraction with Cylindrical Symmetry 

Sometimes the field transmitted by an aperture is cylindrically symmetric. In this 
case, the field at the aperture can be written as 

$$E(x',y',z=0) = E(\rho',z=0) \tag{10.21}$$

where $\rho' \equiv \sqrt{x'^2+y'^2}$. Under cylindrical symmetry, the two-dimensional integration
over x' and y' in (10.13) or (10.19) can be reduced to a single-dimensional
integral over a cylindrical coordinate $\rho'$. With the coordinate transformation

$$x \equiv \rho\cos\phi \qquad y \equiv \rho\sin\phi \qquad x' \equiv \rho'\cos\phi' \qquad y' \equiv \rho'\sin\phi' \tag{10.22}$$
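the Fresnel diffraction integral (10.13) becomes

$$E(\rho,z) = -\frac{i e^{ikz} e^{i\frac{k\rho^2}{2z}}}{\lambda z}\int\limits_0^{2\pi}d\phi'\int\limits_{\text{aperture}}\rho'\,d\rho'\,E(\rho',0)\,e^{i\frac{k\rho'^2}{2z}}\,e^{-i\frac{k}{z}\left(\rho\rho'\cos\phi\cos\phi'+\rho\rho'\sin\phi\sin\phi'\right)} \tag{10.23}$$

Notice that in the exponent of (10.23) we can write

$$\rho'\rho\left(\cos\phi'\cos\phi+\sin\phi'\sin\phi\right) = \rho'\rho\cos\left(\phi'-\phi\right) \tag{10.24}$$

With this simplification, the diffraction formula (10.23) can be written as

$$E(\rho,z) = -\frac{i e^{ikz} e^{i\frac{k\rho^2}{2z}}}{\lambda z}\int\limits_{\text{aperture}}\rho'\,d\rho'\,E(\rho',0)\,e^{i\frac{k\rho'^2}{2z}}\int\limits_0^{2\pi}d\phi'\,e^{-i\frac{k\rho\rho'}{z}\cos\left(\phi'-\phi\right)} \tag{10.25}$$

Figure 10.9 Field amplitude following a circular aperture computed in the Fresnel
approximation.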

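We are able to perform the integration over $\phi'$ with the help of the formula (0.57):

$$\int\limits_0^{2\pi}e^{-i\frac{k\rho\rho'}{z}\cos\left(\phi-\phi'\right)}\,d\phi' = 2\pi J_0\!\left(\frac{k\rho\rho'}{z}\right) \tag{10.26}$$

$J_0$ is called the zero-order Bessel function. Equation (10.25) then reduces to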



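$$E(\rho,z) = -\frac{2\pi i e^{ikz} e^{i\frac{k\rho^2}{2z}}}{\lambda z}\int\limits_{\text{aperture}}\rho'\,d\rho'\,E(\rho',0)\,e^{i\frac{k\rho'^2}{2z}}J_0\!\left(\frac{k\rho\rho'}{z}\right) \qquad \text{(Fresnel approximation with cylindrical symmetry)} \tag{10.27}$$

The integral in (10.27) is called a Hankel transform on $E(\rho',0)\,e^{i\frac{k\rho'^2}{2z}}$.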

In the case of the Fraunhofer approximation, the diffraction integral becomes
a Hankel transform on just the field $E(\rho',z=0)$ since $\exp\left(i\frac{k\rho'^2}{2z}\right)$ goes to one.
Under cylindrical symmetry, the Fraunhofer approximation is

$$E(\rho,z) = -\frac{2\pi i e^{ikz} e^{i\frac{k\rho^2}{2z}}}{\lambda z}\int\limits_{\text{aperture}}\rho'\,d\rho'\,E(\rho',0)\,J_0\!\left(\frac{k\rho\rho'}{z}\right) \qquad \text{(Fraunhofer approximation with cylindrical symmetry)} \tag{10.28}$$

Just as fast Fourier transform algorithms aid in the numerical evaluation of diffrac- 
tion integrals in Cartesian coordinates, fast Hankel transforms also exist and can 
be used with cylindrically symmetric diffraction integrals. 

Example 10.5 

Compute the Fresnel and Fraunhofer diffraction patterns following a circular
aperture (diameter $\ell$) illuminated by a uniform plane wave.

Solution: According to (10.27), the field downstream is

$$E(\rho,z) = -iE_0\frac{2\pi e^{ikz} e^{i\frac{k\rho^2}{2z}}}{\lambda z}\int\limits_0^{\ell/2}\rho'\,d\rho'\,e^{i\frac{k\rho'^2}{2z}}J_0\!\left(\frac{k\rho\rho'}{z}\right)$$

Unfortunately, this Fresnel integral must be performed numerically. The result
of the calculation for a uniform field illuminating a circular aperture is shown in 
Fig. 10.9. 

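On the other hand, the field in the Fraunhofer limit (10.28) is

$$E(\rho,z) = -iE_0\frac{2\pi e^{ikz} e^{i\frac{k\rho^2}{2z}}}{\lambda z}\int\limits_0^{\ell/2}\rho'\,d\rho'\,J_0\!\left(\frac{k\rho\rho'}{z}\right)$$

which can be integrated analytically. It is left as an exercise (see P10.10) to perform
the integration and to show that the intensity of the Fraunhofer pattern is

$$I(\rho,z) = I_0\left(\frac{\pi\ell^2}{4\lambda z}\right)^2\left[2\frac{J_1\left(k\ell\rho/2z\right)}{\left(k\ell\rho/2z\right)}\right]^2 \tag{10.29}$$

The function $2J_1(\xi)/\xi$ (sometimes called the jinc function) looks similar to the sinc
function (see Example 10.4) except that its first zero is at $\xi = 1.22\pi$ rather than at $\pi$.
Note that $\lim_{\xi\to 0} 2J_1(\xi)/\xi = 1$.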



Figure 10.10 Fraunhofer diffrac- 
tion pattern (field amplitude) gen- 
erated for a uniformly illuminated 
circular aperture. 
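The radial integral in the Fraunhofer case of Example 10.5 and the location of the first zero of the jinc function can both be checked numerically with nothing more than the integral representation of the Bessel functions. The sketch below is one way to do this under stated assumptions: it uses Bessel's integral $J_n(x) = \frac{1}{\pi}\int_0^\pi \cos(n\tau - x\sin\tau)\,d\tau$ (for $n = 0$ this is equivalent to (10.26)), evaluates $\int_0^a \rho' J_0(b\rho')\,d\rho'$ by a midpoint rule, and compares it with the closed form $aJ_1(ab)/b$. The helper names, step counts, and sample arguments are hypothetical choices made for illustration.

```python
import math


def bessel_j(n, x, steps=200):
    """J_n(x) from Bessel's integral (1/pi) * int_0^pi cos(n*t - x*sin(t)) dt (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for m in range(steps):
        t = (m + 0.5) * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def hankel_disc(b, a, steps=1000):
    """Midpoint-rule estimate of int_0^a rho' * J_0(b * rho') d rho'."""
    h = a / steps
    total = 0.0
    for m in range(steps):
        rho = (m + 0.5) * h
        total += rho * bessel_j(0, b * rho)
    return total * h


def first_zero_of_j1(lo=3.0, hi=4.5, tol=1e-10):
    """Bisection for the first positive zero of J_1, bracketed by J_1(3) > 0 > J_1(4.5)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_j(1, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    a, b = 1.0, 5.0  # assumed sample values: a plays the role of l/2, b of k*rho/z
    print("numeric integral :", hankel_disc(b, a))
    print("a * J1(a*b) / b  :", a * bessel_j(1, a * b) / b)
    print("first zero / pi  :", first_zero_of_j1() / math.pi)
```

The last line should print a value close to 1.22, matching the statement that the first zero of the jinc function lies at $\xi = 1.22\pi$.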



Appendix 10.A Fresnel-Kirchhoff Diffraction Formula



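To begin the derivation of the Fresnel-Kirchhoff diffraction formula,$^6$ we employ
Green's theorem (proven in Appendix 10.B):

$$\oint\limits_S\left[U\frac{\partial V}{\partial n}-V\frac{\partial U}{\partial n}\right]da = \int\limits_V\left[U\nabla^2V-V\nabla^2U\right]dv \tag{10.30}$$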



The notation $\partial/\partial n$ implies a derivative in the direction normal to the surface. We
choose the following functions:

$$V \equiv e^{ikr}/r \qquad U \equiv E(\mathbf{r}) \tag{10.31}$$



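where $E(\mathbf{r})$ is assumed to satisfy the scalar Helmholtz equation, (10.5). When
these functions are used in Green's theorem (10.30), we obtain

$$\oint\limits_S\left[E\frac{\partial}{\partial n}\frac{e^{ikr}}{r}-\frac{e^{ikr}}{r}\frac{\partial E}{\partial n}\right]da = \int\limits_V\left[E\nabla^2\frac{e^{ikr}}{r}-\frac{e^{ikr}}{r}\nabla^2E\right]dv \tag{10.32}$$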



The right-hand side of this equation vanishes$^7$ since we have

$$E\nabla^2\frac{e^{ikr}}{r}-\frac{e^{ikr}}{r}\nabla^2E = -k^2E\frac{e^{ikr}}{r}+k^2E\frac{e^{ikr}}{r} = 0 \tag{10.33}$$

6 See J. W. Goodman, Introduction to Fourier Optics, Sect. 3-3 (New York: McGraw-Hill, 1968).
7 We exclude the point r = 0; see P0.4 and P0.5.
where we have taken advantage of the fact that $E(\mathbf{r})$ and $e^{ikr}/r$ both satisfy (10.5).
This is exactly the reason for our judicious choices of the functions V and U since
with them we were able to make half of (10.30) disappear. We are left with

$$\oint\limits_S\left[E\frac{\partial}{\partial n}\frac{e^{ikr}}{r}-\frac{e^{ikr}}{r}\frac{\partial E}{\partial n}\right]da = 0 \tag{10.34}$$




Figure 10.11 A two-part surface
enclosing volume V.



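Now consider a volume between a small sphere of radius $\epsilon$ at the origin and an
outer surface of whatever shape. The total surface that encloses the volume
consists of two parts (i.e. $S = S_1 + S_2$ as depicted in Fig. 10.11).
When we apply (10.34) to the surface in Fig. 10.11, we have

$$\oint\limits_{S_2}\left[E\frac{\partial}{\partial n}\frac{e^{ikr}}{r}-\frac{e^{ikr}}{r}\frac{\partial E}{\partial n}\right]da = -\oint\limits_{S_1}\left[E\frac{\partial}{\partial n}\frac{e^{ikr}}{r}-\frac{e^{ikr}}{r}\frac{\partial E}{\partial n}\right]da \tag{10.35}$$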



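Our motivation for choosing this geometry with multiple surfaces is that eventually
we want to find the field at the origin (inside the little sphere) from knowledge
of the field on the outside surface. To this end, we assume that $\epsilon$ is so small that
$E(\mathbf{r})$ is approximately the same everywhere on the surface $S_1$. Then the integral
over $S_1$ becomes

$$\oint\limits_{S_1}\left[E\frac{\partial}{\partial n}\frac{e^{ikr}}{r}-\frac{e^{ikr}}{r}\frac{\partial E}{\partial n}\right]da = \lim_{r=\epsilon\to 0}\int\limits_0^{2\pi}d\theta\int\limits_0^{\pi}\left[E\left(\frac{\partial}{\partial r}\frac{e^{ikr}}{r}\right)\frac{\partial r}{\partial n}-\frac{e^{ikr}}{r}\left(\frac{\partial E}{\partial r}\right)\frac{\partial r}{\partial n}\right]r^2\sin\phi\,d\phi \tag{10.36}$$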



where we have used spherical coordinates. Notice that we have employed the
chain rule to execute the normal derivative $\partial/\partial n$. Since r always points opposite
to the direction of the surface normal $\hat{\mathbf{n}}$, the normal derivative $\partial r/\partial n$ is always
equal to $-1$.$^8$ We can now perform the integration in (10.36) as well as take the
limit as $\epsilon \to 0$ to obtain



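$$\begin{aligned}
\lim_{\epsilon\to 0}\oint\limits_{S_1}\left[E\frac{\partial}{\partial n}\frac{e^{ikr}}{r}-\frac{e^{ikr}}{r}\frac{\partial E}{\partial n}\right]da
&= -4\pi\lim_{\epsilon\to 0}\left[r^2\left(-\frac{e^{ikr}}{r^2}+ik\frac{e^{ikr}}{r}\right)E-r^2\frac{e^{ikr}}{r}\frac{\partial E}{\partial r}\right]_{r=\epsilon}\\
&= -4\pi\lim_{\epsilon\to 0}\left[\left(-e^{ik\epsilon}+ik\epsilon e^{ik\epsilon}\right)E-\epsilon e^{ik\epsilon}\frac{\partial E}{\partial r}\right]_{r=\epsilon}\\
&= 4\pi E(0)
\end{aligned} \tag{10.37}$$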



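With the aid of (10.37), Green's theorem applied to our specific geometry
reduces to

$$E(0) = \frac{1}{4\pi}\oint\limits_{S_2}\left[\frac{e^{ikr}}{r}\frac{\partial E}{\partial n}-E\frac{\partial}{\partial n}\frac{e^{ikr}}{r}\right]da \tag{10.38}$$

8 From the definition of the normal derivative we have $\partial r/\partial n \equiv \nabla r\cdot\hat{\mathbf{n}} = \hat{\mathbf{r}}\cdot\hat{\mathbf{n}} = -1$.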
If we know E everywhere on the outer surface $S_2$, this equation allows us to predict
the field E(0) at the origin. Of course we are free to choose any coordinate system
in order to find the field anywhere inside the surface $S_2$, by moving the origin.

Now let us choose a specific surface $S_2$. Consider an infinite mask with a finite
aperture connected to a hemisphere of infinite radius $R \to \infty$. In the end, we
will suppose that light enters through the aperture in the mask and propagates to our
origin (among other points). In our present coordinate system, the vectors r and $\hat{\mathbf{n}}$ point
opposite to the incoming light.

We must evaluate (10.38) on the surface depicted in Fig. 10.12. For the portion
of $S_2$ which is on the hemisphere, the integrand tends to zero as R becomes large.
To argue this, it is necessary to recognize the fact that at large distances the field
takes on a form proportional to $e^{ikR}/R$ so that the two terms in the integrand
cancel. On the mask, we assume, as did Kirchhoff, that both $\partial E/\partial n$ and E are
zero.$^9$ Thus, we are left with only the integration over the open aperture:

$$E(0) = \frac{1}{4\pi}\iint\limits_{\text{aperture}}\left[\frac{e^{ikr}}{r}\frac{\partial E}{\partial n}-E\frac{\partial}{\partial n}\frac{e^{ikr}}{r}\right]da \tag{10.39}$$



Figure 10.12 Surface $S_2$ depicted as a mask and a large hemisphere.

We have essentially arrived at the result that we are seeking. The field coming
through the aperture is integrated to find the field at the origin, which is located
beyond the aperture. Let us manipulate the formula a little further. The second
term in the integral of (10.39) can be rewritten as follows:

$$\frac{\partial}{\partial n}\frac{e^{ikr}}{r} = \left(\frac{\partial}{\partial r}\frac{e^{ikr}}{r}\right)\frac{\partial r}{\partial n} = \left(\frac{ik}{r}-\frac{1}{r^2}\right)e^{ikr}\cos\left(\mathbf{r},\hat{\mathbf{n}}\right) \cong \frac{ike^{ikr}}{r}\cos\left(\mathbf{r},\hat{\mathbf{n}}\right) \tag{10.40}$$


where $\partial r/\partial n = \cos\left(\mathbf{r},\hat{\mathbf{n}}\right)$ indicates the cosine of the angle between r and $\hat{\mathbf{n}}$. We
have also assumed that the distance r is much larger than a wavelength in order
to drop a term. Next, we assume that the field illuminating the aperture can be
written as $E \cong \tilde{E}(x,y)\,e^{ikz}$. This represents a plane-wave field traveling through
the aperture from left to right. Then, we have

$$\frac{\partial E}{\partial n} = \frac{\partial E}{\partial z}\frac{\partial z}{\partial n} = ik\tilde{E}(x,y)\,e^{ikz}\,(-1) = -ikE \tag{10.41}$$

Substituting (10.40) and (10.41) into (10.39) yields

$$E(0) = -\frac{i}{\lambda}\iint\limits_{\text{aperture}}E\,\frac{e^{ikr}}{r}\left[\frac{1+\cos\left(\mathbf{r},\hat{\mathbf{n}}\right)}{2}\right]da \tag{10.42}$$



Finally, we wish to rearrange our coordinate system to that depicted in Fig. 10.2. 
In our derivation, it was less cumbersome to place the origin at a point after the 



9 Later Sommerfeld noticed that these two assumptions actually contradict each other, and he 
revised Kirchhoff's work to be more accurate. In practice this revision makes only a tiny difference 
as light spills onto the back of the aperture, over a length scale of a wavelength. We will ignore 
this effect and go with Kirchhoff's (slightly flawed) assumption. For further discussion see J. W. 
Goodman, Introduction to Fourier Optics, Sect. 3-4 (New York: McGraw-Hill, 1968). 
aperture. Now that we have completed our mathematics, it is convenient to make
a change of coordinate system and move the origin to the plane of the aperture as
in Fig. 10.2. Then, we can obtain the field at a point lying somewhere after the
aperture by computing

$$E(x,y,z=d) = -\frac{i}{\lambda}\iint\limits_{\text{aperture}}E(x',y',z=0)\,\frac{e^{ikR}}{R}\left[\frac{1+\cos\left(\mathbf{r},\hat{\mathbf{z}}\right)}{2}\right]dx'\,dy' \tag{10.43}$$

where

$$R = \sqrt{(x-x')^2+(y-y')^2+d^2} \tag{10.44}$$

Equation (10.10) is the same as (10.42) after applying a coordinate transformation.
It is called the Fresnel-Kirchhoff diffraction formula and it agrees with (10.1)
except for the obliquity factor $\left[1+\cos\left(\mathbf{r},\hat{\mathbf{z}}\right)\right]/2$.



Appendix 10.B Green's Theorem 

To derive Green's theorem, we begin with the divergence theorem (see (0.11)):

$$\oint\limits_S\mathbf{f}\cdot\hat{\mathbf{n}}\,da = \int\limits_V\nabla\cdot\mathbf{f}\,dv \tag{10.45}$$

The unit vector $\hat{\mathbf{n}}$ always points along the outward normal to the surface S of the
volume V over which the integral is taken. Let the vector function f be $U\nabla V$, where U and V
are both analytic functions of the position coordinate r. Then (10.45) becomes

$$\oint\limits_S\left(U\nabla V\right)\cdot\hat{\mathbf{n}}\,da = \int\limits_V\nabla\cdot\left(U\nabla V\right)dv \tag{10.46}$$

We recognize $\nabla V\cdot\hat{\mathbf{n}}$ as the directional derivative of V, directed along the surface
normal $\hat{\mathbf{n}}$. This is often represented in shorthand notation as

$$\nabla V\cdot\hat{\mathbf{n}} \equiv \frac{\partial V}{\partial n} \tag{10.47}$$

The argument of the integral on the right-hand side of (10.46) can be expanded
with the product rule:

$$\nabla\cdot\left(U\nabla V\right) = \nabla U\cdot\nabla V+U\nabla^2V \tag{10.48}$$

With these substitutions, (10.46) becomes

$$\oint\limits_SU\frac{\partial V}{\partial n}\,da = \int\limits_V\left[\nabla U\cdot\nabla V+U\nabla^2V\right]dv \tag{10.49}$$

Actually, so far we haven't done much. Equation (10.49) is nothing more than the
divergence theorem applied to the vector function $U\nabla V$. Similarly, we can apply
the divergence theorem to an alternative vector function given by the reverse
combination $V\nabla U$. Thus, we can write an equation similar to (10.49) where U
and V are interchanged:

$$\oint\limits_SV\frac{\partial U}{\partial n}\,da = \int\limits_V\left[\nabla V\cdot\nabla U+V\nabla^2U\right]dv \tag{10.50}$$

We subtract (10.50) from (10.49), and this leads to (10.30), known as Green's
theorem.
Exercises




Figure 10.13 



Exercises for 10.1 Huygens' Principle as Formulated by Fresnel 

P10.1 Huygens' principle is often used to describe diffraction through slits,
but it can also be used to describe refraction. Use a drawing program
or a ruler and compass to produce a picture similar to Fig. 10.13, which
shows the graphical prediction of the refracted angle from Huygens'
principle. Verify that the Huygens picture matches the numerical
prediction from Snell's law for an incident angle of your choice. Use
$n_i = 1$ and $n_t = 2$.

HINT: Draw the wavefronts hitting the interface at an angle and treat
each point where the wavefronts strike the interface as the source of
circular waves propagating into the n = 2 material. The wavelength of
the circular waves must be exactly half the wavelength of the incident
light since $\lambda = \lambda_{\text{vac}}/n$. Use at least four point sources and connect the
matching wavefronts by drawing tangent lines as in the figure.

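P10.2 (a) Show that the function

$$f(r) = \frac{A}{r}\cos\left(kr-\omega t\right)$$

is a solution to the wave equation in spherical coordinates with only
radial dependence,

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial f}{\partial r}\right) = \frac{1}{v^2}\frac{\partial^2f}{\partial t^2}$$

Determine what v is, in terms of k and $\omega$.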

(b) If the electric field were a scalar field, we might be done there.
However, it's a vector field, and moreover it must satisfy Maxwell's
equations. We know from experience that it's generally transverse, and
since it's traveling radially let's make a guess that it's oscillating in the $\hat{\boldsymbol{\phi}}$
direction:

$$\mathbf{E}(r) = \frac{A}{r}\cos\left(kr-\omega t\right)\hat{\boldsymbol{\phi}}$$

Show that this choice for E is not consistent with Maxwell's equations.
In particular: (i) show that it does satisfy Gauss's law (1.1); (ii) compute
the curl of E and use Faraday's law (1.3) to deduce B; (iii) show that this B
does satisfy Gauss's law for magnetism (1.2); (iv) show that this B does not
satisfy Ampere's law (1.4).

(c) A somewhat more complicated 'spherical' wave

$$E(r,\phi) = \frac{A\sin\phi}{r}\left[\cos\left(kr-\omega t\right)-\frac{\sin\left(kr-\omega t\right)}{kr}\right]$$

does satisfy Maxwell's equations. Describe how this wave behaves as
a function of r and $\phi$. What conditions need to be satisfied for this
equation to reduce to the spherical wave formula used in the diffraction
formulas?



Exercises for 10.2 Scalar Diffraction Theory 

P10.3 Show that $E(r) = E_0r_0e^{ikr}/r$ is a solution to the scalar Helmholtz
equation (10.5).

HINT:

$$\nabla^2\psi = \frac{1}{r}\frac{\partial^2}{\partial r^2}\left(r\psi\right)+\frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right)+\frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2}$$

P10.4 Learn by heart the derivation of the Fresnel-Kirchhoff diffraction
formula (outlined in Appendix 10.A). Indicate the percentage of how well
you understand the derivation. If you write 100%, it means
that you can reproduce the derivation after closing your notes.

P10.5 Check that (10.16) is the solution to the paraxial wave equation (10.15). 



Exercises for 10.4 Fraunhofer Approximation 

P10.6 (a) Repeat Example 10.1 to find the on-axis intensity after a circular
aperture in both the Fresnel and Fraunhofer approximations. (HINT:
Use (10.27) and (10.28) to obtain the fields at $\rho = 0$.) Also make suitable
approximations directly to (10.3) to obtain the same answers.

(b) Check how well the Fresnel and Fraunhofer approximations work
by graphing the three curves (i.e. (10.3) and the curves obtained in part
(a)) on a single plot as a function of z. Take $\ell = 10\ \mu\text{m}$ and $\lambda = 500$ nm.
To see the result better, use a log scale on the z-axis.
Figure 10.14 "The Fraunhofer Approximation" by Sterling Cornaby

Figure 10.15

L10.7 (a) Why does the on-axis intensity behind a circular opening fluctuate
(see Example 10.1) whereas the on-axis intensity behind a circular 
obstruction remains constant (see Example 10.2)? 

(b) Create a collimated laser beam several centimeters wide. Observe 
the on-axis intensity on a movable screen (e.g. a hand-held card) be- 
hind a small circular aperture and behind a small circular obstruction 
placed in the beam. (video)

(c) In the case of the circular aperture, measure the distance to several 
on-axis minima and check that it agrees with prediction. (See problem 
P10.6.) 







Figure 10.16 



P10.8 Calculate the Fraunhofer diffraction field and intensity patterns for a
rectangular aperture (dimensions $\Delta x$ by $\Delta y$) illuminated by a plane
wave $E_0$. In other words, derive (10.20).

P10.9 A single narrow slit has a mask placed over it so the aperture function
is not a square pulse but rather a cosine: $E(x',y',0) = E_0\cos(x'/L)$ for
$-L/2 < x' < L/2$ and $E(x',y',0) = 0$ otherwise. Calculate the far-field
(Fraunhofer) diffraction pattern. Make a plot of intensity as a function
of $xkL/2z$; qualitatively compare the pattern to that of a regular single
slit.



Exercises for 10.5 Diffraction with Cylindrical Symmetry 

P10.10 Calculate the Fraunhofer diffraction intensity pattern (10.29) for a
circular aperture (diameter $\ell$) illuminated by a plane wave $E_0$.



Chapter 11

Diffraction Applications 



In this chapter, we consider a number of practical examples of diffraction. We first 
discuss diffraction theory in systems involving lenses. The Fraunhofer diffraction 
pattern discussed in section 10.4, applicable in the far-field limit, is imaged to the 
focus of a lens when the lens is placed in the stream of light. This has important 
implications for the resolution of instruments such as telescopes or the human 
eye. 

The array theorem, which applies in the Fraunhofer limit, is introduced in
section 11.3. This theorem is a powerful mathematical tool that enables one to deal
conveniently with diffraction from an array of identical apertures. One of the 
important uses of the array theorem is in determining Fraunhofer diffraction 
from a grating, since a diffraction grating can be thought of as an array of narrow 
slit apertures. In section 11.5, we study the workings of a diffraction spectrometer.
To find the resolution limitations, one combines the diffraction properties of 
gratings with the Fourier properties of lenses. 

Finally, we consider a Gaussian laser beam to understand its focusing and 
diffraction properties. The information presented here comes up remarkably 
often in research activity. We often think of lasers as collimated beams of light 
that propagate indefinitely without expanding. However, the laws of diffraction 
require that every finite beam eventually grow in width. The rate at which a laser 
beam diffracts depends on its beam waist size. Because laser beams usually have 
narrow divergence angles and therefore obey the paraxial approximation, we can 
calculate their behavior via the Fresnel approximation discussed in section 10.3. 
Appendix 11.A discusses the ABCD law for Gaussian beams, which is a method
of computing the effects of optical elements represented by ABCD matrices on 
Gaussian laser beams. 

11.1 Fraunhofer Diffraction Through a Lens 

The Fraunhofer limit corresponds to the ultimate amount of diffraction that 
light in an optical system experiences. As has been previously discussed, the 
Fraunhofer approximation applies to diffraction when the propagation distance 
from an aperture is sufficiently large (see (10.18) and (10.19)). Mathematically, it
is obtained via a two-dimensional Fourier transform. The intensity of the far-field
diffraction pattern is

$$I(x,y,z) = \frac{1}{2}\epsilon_0c\left|\frac{1}{\lambda z}\iint\limits_{\text{aperture}}E(x',y',0)\,e^{-i\frac{k}{z}\left(xx'+yy'\right)}\,dx'\,dy'\right|^2 \tag{11.1}$$

Notice that the dependence of the diffraction on x, y, and z comes only
through the combinations $\theta_x = x/z$ and $\theta_y = y/z$. Therefore, the diffraction
pattern in the Fraunhofer limit is governed by the two angles $\theta_x$ and $\theta_y$, and
the pattern preserves itself indefinitely. As the light continues to propagate, the
pattern increases in size at a rate proportional to distance traveled so that the
angular width is preserved. The situation is depicted in Fig. 11.1.

Figure 11.1 Diffraction in the far field.

Recall that in order to use the Fraunhofer diffraction formula we need to
satisfy $z \gg \pi\left(\text{aperture radius}\right)^2/\lambda$ (see (10.18)). As an example, if an aperture
with a 1 cm radius (not necessarily circular) is used with visible light, the light
must travel more than a kilometer in order to reach the Fraunhofer limit. It
may therefore seem unlikely to reach the Fraunhofer limit in a typical optical
system, especially if the aperture or beam size is relatively large. Nevertheless,
spectrometers, which typically utilize diffraction gratings many centimeters wide,
depend on achieving the Fraunhofer limit within the confines of a manageable
instrument box. This is accomplished using imaging techniques. The Fraunhofer
limit is also important to the performance of other optical instruments that use
lenses (e.g. a telescope).

Consider a lens with focal length f placed in the path of light following an
aperture (see Fig. 11.2). Let the lens be placed an arbitrary distance L after the
aperture. The lens produces an image of the Fraunhofer pattern at a new location
$d_i$ following the lens according to the imaging formula (see (9.55))

$$\frac{1}{f} = \frac{1}{-(z-L)}+\frac{1}{d_i} \tag{11.2}$$

Keep in mind that the lens interrupts the light before the Fraunhofer pattern
has a chance to form. This means that the Fraunhofer diffraction pattern may
be thought of as a virtual object a distance $z - L$ to the right of the lens. Since
the Fraunhofer diffraction pattern occurs at very large distances (i.e. $z \to \infty$) the
image of the Fraunhofer pattern appears at the focus of the lens:

$$d_i = f \tag{11.3}$$

Figure 11.2 Imaging of the Fraunhofer diffraction pattern to the focus of a lens.
(Figure label: image of the pattern that would have appeared at infinity.)

Thus, a lens makes it very convenient to observe the Fraunhofer diffraction pat- 
tern even from relatively large apertures. It is not necessary to let the light propa- 
gate for kilometers. We need only observe the pattern at the focus of the lens as 
shown in Fig. 11.2. Notice that the spacing L between the aperture and the lens is 
unimportant to this conclusion. 

Even though we know that the Fraunhofer diffraction pattern occurs at the
focus of a lens, the question remains as to the size of the image. To find the answer,
let us examine the magnification (9.56), which is given by

$$M = -\frac{d_i}{-(z-L)} \tag{11.4}$$

Taking the limit of very large z and employing (11.3), the magnification becomes

$$M \to \frac{f}{z} \tag{11.5}$$

This is a remarkable result. When the lens is inserted, the size of the diffraction
pattern decreases by the ratio of the lens focal length f to the original distance
z to a far-away screen. Since in the Fraunhofer regime the size of the diffraction
pattern is proportional to distance (i.e. size $\propto z$), the image at the focus of the lens
scales in proportion to the focal length (i.e. size $\propto f$). This means that the angular
width of the pattern is preserved! With the lens in place, we can rewrite (11.1)
straightaway as

$$I(x,y,L+f) \cong \frac{1}{2}\epsilon_0c\left|\frac{1}{\lambda f}\iint\limits_{\text{aperture}}E(x',y',0)\,e^{-i\frac{k}{f}\left(xx'+yy'\right)}\,dx'\,dy'\right|^2 \tag{11.6}$$

which describes the intensity distribution pattern at the focus of the lens.
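To make the scaling concrete, the sketch below compares the position of the first zero of the single-slit pattern, $x = \lambda z/\Delta x$ from (10.20), on a far-away screen with its position at the focus of a lens, where z is replaced by f as in (11.6). The aperture width, wavelength, screen distance, and focal length are illustrative assumptions, and the function name is hypothetical.

```python
def first_zero_position(wavelength, aperture_width, distance):
    """First zero of the sinc^2 pattern in (10.20): x = wavelength * distance / aperture_width.

    Pass the screen distance z for free propagation, or the focal length f
    when the pattern is viewed at the focus of a lens, as in (11.6).
    """
    return wavelength * distance / aperture_width


if __name__ == "__main__":
    wavelength = 500e-9     # assumed visible wavelength (m)
    aperture_width = 1e-2   # assumed 1 cm wide aperture (m)
    z_screen = 2000.0       # assumed far-away screen distance (m)
    focal_length = 0.5      # assumed lens focal length (m)

    x_far = first_zero_position(wavelength, aperture_width, z_screen)
    x_focus = first_zero_position(wavelength, aperture_width, focal_length)

    print(f"first zero on far screen : {x_far:.3e} m")
    print(f"first zero at lens focus : {x_focus:.3e} m")
    print(f"size ratio (should equal f/z): {x_focus / x_far:.3e}")
    print(f"angular half-width in both cases: {wavelength / aperture_width:.3e} rad")
```

The ratio of the two sizes equals the magnification (11.5), while the angular width $\lambda/\Delta x$ is the same in both cases.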

Although (11.6) correctly describes the intensity, we cannot easily write the 
electric field since the imaging techniques that we have used do not render the 
phase information. To obtain an expression for the field, it will be necessary to 
employ the Fresnel diffraction formula. In addition, we need to know how a lens 
adjusts the phase fronts of the light passing through it. 

Phase Front Alteration by a Lens 

Consider a monochromatic light field that goes through a thin lens with focal
length f. In traversing the lens, the wavefront undergoes a phase shift that varies
across the lens. We will reference the phase shift to that experienced by the light
that goes through the center of the lens. In Fig. 11.3, $R_1$ is a positive radius
of curvature, and $R_2$ is a negative radius of curvature, according to our previous
convention. We take the distances $\ell_1$ and $\ell_2$, as drawn, to be positive.

Figure 11.3 A thin lens, which modifies the phase of a field passing through.

Figure 11.4 The phase fronts of a plane wave are bent as they pass through a lens.

The light passing through the off-axis portion of the lens experiences less material
than the light passing through the center. The difference in optical path length is
$(n-1)\left(\ell_1+\ell_2\right)$ (see discussion connected with (9.14)). This means that the phase
of the field passing through the off-axis portion of the lens relative to the phase of
the field passing through the center is

$$\Delta\phi = -k\left(n-1\right)\left(\ell_1+\ell_2\right) \tag{11.7}$$

The negative sign indicates a phase advance (i.e. same sign as $-\omega t$). Since the
off-axis light travels through less material, the phase of the wave front gets ahead
of the light traveling through the center of the lens. In (11.7), k represents the wave
number in vacuum (i.e. $2\pi/\lambda_{\text{vac}}$), since $\ell_1$ and $\ell_2$ correspond to distances outside
of the lens material.

We can find expressions for $\ell_1$ and $\ell_2$ from the equations describing the spherical
surfaces of the lens:

$$\left(R_1-\ell_1\right)^2+x^2+y^2 = R_1^2 \qquad \left(R_2+\ell_2\right)^2+x^2+y^2 = R_2^2 \tag{11.8}$$

In the Fresnel approximation, which takes place in the paraxial limit, it is appropriate
to neglect the terms $\ell_1^2$ and $\ell_2^2$ in comparison with the other terms present.
Within this approximation, equations (11.8) become

$$\ell_1 \cong \frac{x^2+y^2}{2R_1} \quad\text{and}\quad \ell_2 \cong -\frac{x^2+y^2}{2R_2} \tag{11.9}$$

Substitution into (11.7) yields

$$\Delta\phi = -k\left(n-1\right)\left(\frac{1}{R_1}-\frac{1}{R_2}\right)\frac{\left(x^2+y^2\right)}{2} = -\frac{k}{2f}\left(x^2+y^2\right) \tag{11.10}$$

where the focal length of a thin lens f has been introduced according to the lens-maker's
formula (9.46).

In summary, the light traversing a lens experiences a relative phase shift given by

$$E\left(x,y,z_{\text{after lens}}\right) = E\left(x,y,z_{\text{before lens}}\right)e^{-i\frac{k}{2f}\left(x^2+y^2\right)} \tag{11.11}$$

Equation (11.11) introduces a wave-front curvature to the field. For example, if a
plane wave (i.e. a uniform field $E_0$) passes through the lens, the field emerges with
a spherical-like wave front converging towards the focus of the lens.
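The quality of the paraxial step from (11.8) to (11.10) can be examined numerically. The sketch below solves (11.8) exactly for $\ell_1$ and $\ell_2$, inserts them into (11.7), and compares the result with the quadratic phase $-k(x^2+y^2)/2f$, with f taken from the thin-lens relation $1/f = (n-1)(1/R_1 - 1/R_2)$ implied by (11.10). It assumes a biconvex lens ($R_1 > 0$, $R_2 < 0$); the index, radii, and wavelength are illustrative values, and the function names are hypothetical.

```python
import math


def focal_length(n, r1, r2):
    """Thin-lens focal length from 1/f = (n - 1) * (1/R1 - 1/R2)."""
    return 1.0 / ((n - 1.0) * (1.0 / r1 - 1.0 / r2))


def exact_phase(rho, n, r1, r2, wavelength):
    """Phase (11.7) with l1 and l2 solved exactly from (11.8); assumes R1 > 0 and R2 < 0."""
    k = 2.0 * math.pi / wavelength
    l1 = r1 - math.sqrt(r1 * r1 - rho * rho)   # from (R1 - l1)^2 + rho^2 = R1^2
    l2 = -r2 - math.sqrt(r2 * r2 - rho * rho)  # from (R2 + l2)^2 + rho^2 = R2^2
    return -k * (n - 1.0) * (l1 + l2)


def paraxial_phase(rho, n, r1, r2, wavelength):
    """Quadratic phase of (11.10): -k * rho^2 / (2 f)."""
    k = 2.0 * math.pi / wavelength
    return -k * rho * rho / (2.0 * focal_length(n, r1, r2))


if __name__ == "__main__":
    n, r1, r2, wavelength = 1.5, 0.1, -0.1, 500e-9  # assumed lens and wavelength
    print("focal length:", focal_length(n, r1, r2), "m")
    for rho in (1e-3, 5e-3, 1e-2):
        exact = exact_phase(rho, n, r1, r2, wavelength)
        approx = paraxial_phase(rho, n, r1, r2, wavelength)
        print(f"rho = {rho:.0e} m  exact = {exact:.4f} rad  paraxial = {approx:.4f} rad")
```

Close to the axis the two phases agree to a small fraction of a percent, and the discrepancy grows as the distance from the axis becomes an appreciable fraction of the radii of curvature.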

We compute the diffraction pattern after the lens in three steps, as illustrated 
in Fig. 11.5. First, we use the Fresnel diffraction formula to compute the field 
arriving at the lens. Second, we adjust the phase front of the light passing through 
the lens according to (11.11). Third, we use the field exiting the lens as the input 
for a second Fresnel diffraction integral to find the field at the lens focus. The 
result gives an intensity pattern in agreement with (11.6). It also provides the full 
expression for the field, including its phase. 
Starting from the known field $E(x',y',0)$ at the aperture, we compute the field
incident on the lens using the Fresnel approximation:

$$E(x'',y'',L) = -\frac{ie^{ikL}e^{i\frac{k}{2L}\left(x''^2+y''^2\right)}}{\lambda L}\iint\limits_{\text{aperture}}E(x',y',0)\,e^{i\frac{k}{2L}\left(x'^2+y'^2\right)}e^{-i\frac{k}{L}\left(x''x'+y''y'\right)}\,dx'\,dy' \tag{11.12}$$

(The double primes keep track of distinct variables in sequential diffraction
integrals.) As mentioned, the field gains a phase factor according to (11.11) upon
transmitting through the lens. Finally, we use the Fresnel diffraction formula a
second time to propagate the distance f from the back of the thin lens:

$$E(x,y,L+f) = -\frac{ie^{ikf}e^{i\frac{k}{2f}\left(x^2+y^2\right)}}{\lambda f}\iint\left[E(x'',y'',L)\,e^{-i\frac{k}{2f}\left(x''^2+y''^2\right)}\right]e^{i\frac{k}{2f}\left(x''^2+y''^2\right)}e^{-i\frac{k}{f}\left(xx''+yy''\right)}\,dx''\,dy'' \tag{11.13}$$

As you can probably appreciate, the injection of (11.12) into (11.13) makes a
rather long formula involving four dimensions of integration. Nevertheless, two
of the integrals can be performed in advance of choosing the aperture (i.e. those
over x'' and y''). This is accomplished with the help of the integral formula (0.55)
(even though in this instance the real part of a is zero). After this cumbersome
work, (11.13) reduces to

$$E(x,y,L+f) = -i\frac{e^{ik\left(L+f\right)}e^{i\frac{k}{2f}\left(x^2+y^2\right)}e^{-i\frac{kL}{2f^2}\left(x^2+y^2\right)}}{\lambda f}\iint\limits_{\text{aperture}}E(x',y',0)\,e^{-i\frac{k}{f}\left(xx'+yy'\right)}\,dx'\,dy' \tag{11.14}$$

Figure 11.5 Diffraction from an aperture viewed at the focus of a lens.



Notice that at least the integration portion of this formula looks exactly like 
the Fraunhofer diffraction formula! This happened even though in the preceding 
discussion we did not at any time specifically make the Fraunhofer approximation. 
The result (11.14) implies the intensity distribution (11.6) as anticipated. However, 
the phase of the field is also revealed in (11.14). 

In general, the field carries a wave front curvature as it passes through the
focal plane of the lens. In the special case L = f, the diffraction formula takes a
particularly simple form:

$$E(x,y,L+f)\Big|_{L=f} = -i\frac{e^{2ikf}}{\lambda f}\iint E(x',y',0)\,e^{-i\frac{k}{f}(xx'+yy')}\,dx'\,dy' \tag{11.15}$$

When the lens is placed at this special distance following the aperture, the Fraun- 
hofer diffraction pattern viewed at the focus of the lens carries a flat wave front. 



11.2 Resolution of a Telescope 

In the previous section we learned that the Fraunhofer diffraction pattern appears
at the focus of a lens. This has important implications for telescopes and other
optical instruments. In essence, any optical instrument incorporates an aperture,
limiting the light that enters. If nothing else, the diameter of a lens itself acts
effectively as an aperture. The pupil of the human eye is an aperture that induces
a Fraunhofer diffraction pattern to occur at the retina. Cameras have irises which
aperture the light, again causing a Fraunhofer diffraction pattern to occur at the
image plane.

Figure 11.6 To resolve distinct images at the focus of a lens, the angular
separation must exceed the width of the Fraunhofer diffraction patterns.

Figure 11.7 (a) First-order Bessel function. (b) Square of the jinc function.

Of course, the focus of the lens is just where one needs to look in order to 
see images of distant objects. The Fraunhofer pattern, which occurs at the focus, 
represents the ultimate amount of diffraction caused by an aperture. This has the 
effect of blurring out features in the image and limiting resolution. This illustrates 
why it is impossible to focus light to a true point. 

Suppose you point a telescope at two distant stars. An image of each star is
formed in the focal plane of the lens. The angular separation between the two
images (referenced from the lens) is the same as the angular separation between
the stars.¹ This is depicted in Fig. 11.6.

A resolution problem occurs when the Fraunhofer diffraction causes the 
image of each star to blur by more than the angular separation between them. 
In this case the two images cannot be resolved because they 'bleed' into one 
another. 

The Fraunhofer diffraction pattern from a circular aperture was computed
previously (see (10.29)). At the focus of a lens, this pattern becomes

$$I(\rho,f) = I_0\left(\frac{\pi\ell^2}{4\lambda f}\right)^2\left[2\,\frac{J_1(k\ell\rho/2f)}{(k\ell\rho/2f)}\right]^2 \tag{11.16}$$

where f, the focal length of the lens, takes the place of z in the diffraction formula.
The parameter $\ell$ is the diameter of the lens. This intensity pattern contains the
first-order Bessel function $J_1$, which behaves somewhat like a sine wave as seen in
Fig. 11.7. The main differences are that the zero crossings are not exactly periodic
and the function slowly diminishes with larger arguments. The first zero crossing
(after x = 0) occurs at 1.22π.
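The quoted location of the first zero can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates $J_1$ from Bessel's integral representation, $J_1(x) = \frac{1}{\pi}\int_0^\pi \cos(\tau - x\sin\tau)\,d\tau$, and locates the first positive root by bisection. The function names `bessel_j1` and `first_zero_of_j1`, the step count, and the bracketing interval [3, 4.5] are illustrative assumptions.

```python
import math

def bessel_j1(x, steps=2000):
    """Approximate J_1(x) from Bessel's integral using the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi

def first_zero_of_j1(lo=3.0, hi=4.5, tol=1e-10):
    """Bisection for the first positive root of J_1.

    Assumes J_1 changes sign exactly once on [lo, hi].
    """
    f_lo = bessel_j1(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j1(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

root = first_zero_of_j1()
print(root / math.pi)  # close to 1.22
```

The ratio printed is the coefficient that later appears in the Rayleigh criterion.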

The intensity pattern described by (11.16) contains the factor $2J_1(\xi)/\xi$, where
$\xi$ represents the combination $k\ell\rho/2f$. As noticed in Fig. 11.7, $J_1(\xi)$ goes to zero
at $\xi = 0$. Thus, we have a zero-divided-by-zero situation when evaluating $2J_1(\xi)/\xi$
at the origin. This is similar to the sinc function (i.e. $\sin(\xi)/\xi$), which approaches
one at the origin. In fact, $2J_1(\xi)/\xi$ is sometimes called the jinc function because it
also approaches one at the origin. The square of the jinc is shown in Fig. 11.7b.
This curve is proportional to the intensity described in (11.16). This pattern is
sometimes called an Airy pattern after Sir George Biddell Airy (English, 1801-1892)
who first described the pattern. As can be seen in Fig. 11.7b, the intensity quickly
drops at larger radii.

¹In the thin-lens approximation, the ray from either star that traverses the center of the lens (i.e.
y = 0) maintains its angle:
$$\begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix}\begin{bmatrix} 0 \\ \theta \end{bmatrix} = \begin{bmatrix} 0 \\ \theta \end{bmatrix}$$

We now return to the question of whether the images of two nearby stars
as depicted in Fig. 11.6 can be distinguished. Since the peak in Fig. 11.7b is the
dominant feature in the diffraction pattern, we will say that the two stars are
resolved if the angle between them is enough to keep their respective diffraction
peaks from seriously overlapping. We will adopt the criterion suggested by Lord
Rayleigh that the peaks are distinguishable if the peak of one pattern is no closer
than the first zero to the other peak. This situation is shown in Fig. 11.8.

The angle that corresponds to this separation of diffraction patterns is found
by setting the argument of (11.16) equal to 1.22π, the location of the first zero:

$$\frac{k\ell\rho}{2f} = 1.22\pi \tag{11.17}$$

With a little rearranging we have

$$\theta_{\min} \equiv \frac{\rho}{f} = \frac{1.22\lambda}{\ell} \tag{11.18}$$



Here we have associated the ratio $\rho/f$ (i.e. the radius of the diffraction pattern
compared to the distance from the lens) with an angle $\theta_{\min}$. The Rayleigh criterion
requires that the diffraction patterns be separated by at least this angle before we
say that they are resolved.

$\theta_{\min}$ depends on the diameter of the lens $\ell$ as well as on the wavelength of the
light. Since the angle between the images and the angle between the objects are
the same, $\theta_{\min}$ gives the minimum angle between objects that can be resolved with
a given instrument. This analysis assumes that the light from the two objects is
incoherent, meaning the intensities in the image plane add; interferences between
the two fields fluctuate rapidly in time and average away.

Figure 11.8 The Rayleigh criterion for a circular aperture.

Example 11.1

What minimum telescope diameter is required to distinguish a Jupiter-like planet
(orbital radius $8\times10^{8}$ km) from its star if they are 10 light-years away?

Solution: From (11.18) and assuming 500 nm light, we need

$$\ell > \frac{1.22\lambda}{\theta_{\min}} = \frac{1.22\,(500\times10^{-9}\ \mathrm{m})}{(8\times10^{11}\ \mathrm{m})/(10\ \mathrm{ly})}\times\frac{9.5\times10^{15}\ \mathrm{m}}{\mathrm{ly}} = 0.07\ \mathrm{m}$$

This seems like a piece of cake; a telescope with a diameter bigger than 7 cm will do
the trick. However, the vastly unequal brightness of the star and the planet is the
real technical challenge. The faint diffraction rings in the star's diffraction pattern
completely swamp the faint signal from the planet.
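The arithmetic in Example 11.1 can be reproduced with a few lines of code. This is only a sketch of the Rayleigh criterion (11.18) in the small-angle limit; the function name `min_diameter` and the variable names are illustrative, and the light-year conversion uses the same rounded value as the example.

```python
LIGHT_YEAR_M = 9.5e15  # metres per light-year (rounded, as in the example)

def min_diameter(wavelength, angle):
    """Smallest aperture diameter that resolves `angle` (radians), from (11.18)."""
    return 1.22 * wavelength / angle

orbital_radius = 8e11            # m (8 x 10^8 km)
distance = 10 * LIGHT_YEAR_M     # m
angle = orbital_radius / distance  # small-angle approximation, radians

diameter = min_diameter(500e-9, angle)
print(f"{diameter:.3f} m")  # about 0.07 m
```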
Figure 11.9 Array of identical apertures.



11.3 The Array Theorem 

In this section we develop the array theorem, which is used for calculating the 
Fraunhofer diffraction from an array of N identical apertures. We will be using 
the theorem to compute diffraction from a grating, which may be thought of as a 
mask with many closely spaced identical slits. However, the array theorem can be 
applied to apertures with any shape, as suggested by Fig. 11.9.

Consider N apertures in a mask, each with the identical field distribution
described by $E_{\rm aperture}(x', y', 0)$. Outside of the aperture, we suppose $E_{\rm aperture}(x', y', 0)$
is zero, so in a diffraction integral we won't need to worry about the limits of
integration; we can just integrate over the whole mask. Each identical aperture has
a unique location on the mask. Let the location of the nth aperture be designated
by the coordinates $(x'_n, y'_n)$. The field associated with the nth aperture is then
$E_{\rm aperture}(x' - x'_n, y' - y'_n, 0)$, where the offset in the arguments shifts the location of
the aperture. The field comprising all of the identical apertures is

$$E(x',y',0) = \sum_{n=1}^{N} E_{\rm aperture}(x'-x'_n,\,y'-y'_n,\,0) \tag{11.19}$$



We next compute the Fraunhofer diffraction pattern for the above field. Upon
inserting (11.19) into the Fraunhofer diffraction formula (10.19) we obtain

$$E(x,y,z) = -i\frac{e^{ikz}\,e^{i\frac{k}{2z}(x^2+y^2)}}{\lambda z}\sum_{n=1}^{N}\int_{-\infty}^{\infty}dx'\int_{-\infty}^{\infty}dy'\,E_{\rm aperture}(x'-x'_n,\,y'-y'_n,\,0)\,e^{-i\frac{k}{z}(xx'+yy')} \tag{11.20}$$

where we have taken the summation out in front of the integral. We have also
integrated over the entire (infinitely wide) mask since $E_{\rm aperture}$ is nonzero only
inside each aperture.

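Even without yet choosing the shape of the identical apertures, we can make
some progress on (11.20) with the change of variables $x'' \equiv x' - x'_n$ and $y'' \equiv y' - y'_n$:

$$E(x,y,z) = -i\frac{e^{ikz}\,e^{i\frac{k}{2z}(x^2+y^2)}}{\lambda z}\sum_{n=1}^{N}\int_{-\infty}^{\infty}dx''\int_{-\infty}^{\infty}dy''\,E_{\rm aperture}(x'',y'',0)\,e^{-i\frac{k}{z}\left[x(x''+x'_n)+y(y''+y'_n)\right]} \tag{11.21}$$

Next we pull the factor $\exp\{-i\frac{k}{z}(xx'_n + yy'_n)\}$ out in front of the integral to arrive
at our final result:

$$E(x,y,z) = \left[\sum_{n=1}^{N}e^{-i\frac{k}{z}(xx'_n+yy'_n)}\right]\left[-i\frac{e^{ikz}\,e^{i\frac{k}{2z}(x^2+y^2)}}{\lambda z}\int_{-\infty}^{\infty}dx'\int_{-\infty}^{\infty}dy'\,E_{\rm aperture}(x',y',0)\,e^{-i\frac{k}{z}(xx'+yy')}\right] \tag{11.22}$$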
For the sake of elegance, we have traded back x' for x" and y' for y" as the
variables of integration. Equation (11.22) is known as the array theorem.² Note
that the second factor in brackets is exactly the Fraunhofer diffraction pattern
from a single aperture centered on x' = 0 and y' = 0. When more than one
identical aperture is present, we only need to evaluate the Fraunhofer diffraction
formula for a single aperture. Then, the single-aperture result is multiplied by the
summation in front, which entirely contains the information about the placement
of the (many) identical apertures.
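The factorization in the array theorem can be illustrated numerically in one dimension. The sketch below is an illustration under simplifying assumptions, not the book's method: the apertures are uniformly illuminated slits, the common prefactor in (11.22) is dropped, and `q` stands for the combination kx/z. It compares a direct quadrature over every slit with the product of the array sum and a single centered-slit integral. All function names and numerical values are hypothetical.

```python
import cmath

def aperture_integral(q, center, width, samples=400):
    """Midpoint-rule estimate of the integral of exp(-i q x') over one slit."""
    dx = width / samples
    start = center - width / 2
    return sum(cmath.exp(-1j * q * (start + (j + 0.5) * dx))
               for j in range(samples)) * dx

def direct_pattern(q, centers, width):
    """Integrate over every slit of the mask, as in (11.20)."""
    return sum(aperture_integral(q, c, width) for c in centers)

def array_theorem_pattern(q, centers, width):
    """Array sum times the single centered-aperture integral, as in (11.22)."""
    array_factor = sum(cmath.exp(-1j * q * c) for c in centers)
    return array_factor * aperture_integral(q, 0.0, width)

centers = [-1.5, -0.5, 0.5, 1.5]   # slit positions (arbitrary units)
for q in (0.0, 0.7, 2.3, 6.1):
    diff = abs(direct_pattern(q, centers, 0.3) - array_theorem_pattern(q, centers, 0.3))
    print(q, diff)  # differences are at the level of rounding error
```

The agreement holds for any set of positions, which is the point of the theorem: the placement information lives entirely in the sum.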

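Example 11.2

Calculate the Fraunhofer diffraction pattern for two identical circular apertures
with diameter $\ell$ whose centers are separated by a spacing h.

Solution: As computed previously, the single-aperture Fraunhofer diffraction pattern
from a circular aperture (see (10.29)) is

$$I_{\rm aperture}(\rho,z) = I_0\left(\frac{\pi\ell^2}{4\lambda z}\right)^2\left[2\,\frac{J_1(k\ell\rho/2z)}{(k\ell\rho/2z)}\right]^2$$

From the array theorem (11.22), the intensity of the overall diffraction pattern is

$$I(x,y,z) = \left|\sum_{n=1}^{2}e^{-i\frac{k}{z}(xx'_n+yy'_n)}\right|^2 I_0\left(\frac{\pi\ell^2}{4\lambda z}\right)^2\left[2\,\frac{J_1(k\ell\rho/2z)}{(k\ell\rho/2z)}\right]^2$$

Let $y'_1 = y'_2 = 0$. To create the separation h, let $x'_1 = -h/2$ and $x'_2 = h/2$. Then

$$\sum_{n=1}^{2}e^{-i\frac{k}{z}(xx'_n+yy'_n)} = e^{i\frac{khx}{2z}} + e^{-i\frac{khx}{2z}} = 2\cos\left(\frac{khx}{2z}\right)$$

The overall pattern then becomes

$$I(x,y,z) = 4I_0\left(\frac{\pi\ell^2}{4\lambda z}\right)^2\left[2\,\frac{J_1(k\ell\rho/2z)}{(k\ell\rho/2z)}\right]^2\cos^2\left(\frac{khx}{2z}\right)$$

This pattern can be seen in Fig. 11.10.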



Figure 11.10 Fraunhofer diffraction pattern from two identical circular holes
separated by twice their diameters.

²A somewhat abstract alternative route to the array theorem recognizes that the field for
each aperture can be written as a 2-D convolution (see P0.26) between the aperture function
$E_{\rm aperture}(x', y', 0)$ and delta functions specifying the aperture location:

$$E_{\rm aperture}(x'-x'_n,\,y'-y'_n,\,0) = \int_{-\infty}^{\infty}dx''\int_{-\infty}^{\infty}dy''\,\delta(x''-x'_n)\,\delta(y''-y'_n)\,E_{\rm aperture}(x'-x'',\,y'-y'',\,0)$$

The integral in (11.20) therefore may be viewed as a 2-D Fourier transform of a convolution, where
kx/z and ky/z play the role of spatial frequencies. The convolution theorem (see P0.26) indicates
that this is the same as the product of Fourier transforms. The 2-D Fourier transform for the delta
function (times 2π) is

$$\int_{-\infty}^{\infty}dx''\int_{-\infty}^{\infty}dy''\,\delta(x''-x'_n)\,\delta(y''-y'_n)\,e^{-i\frac{k}{z}(xx''+yy'')} = e^{-i\frac{k}{z}(xx'_n+yy'_n)}$$

The array theorem (11.22) exhibits this factor. It multiplies the single-aperture Fraunhofer diffraction
integral, which is the Fourier transform of the other function.
11.4 Diffraction Grating

Figure 11.11 Transmission grating.



In this section we will use the array theorem to calculate the diffraction from 
a grating comprised of an array of equally spaced identical slits. An array of 
uniformly spaced slits is called a transmission grating (see Fig. 11.11). Reflection 
gratings are similar, being composed of an array of narrow rectangular mirrors 
that behave similarly to the slits. 

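The Fraunhofer diffraction pattern from a single rectangular aperture was
previously calculated (see Example 10.4 and problem P10.8):

$$E_{\rm aperture}(x,y,z) = -iE_0\frac{\Delta x\,\Delta y\,e^{ikz}}{\lambda z}\,e^{i\frac{k}{2z}(x^2+y^2)}\,\mathrm{sinc}\left(\frac{\pi\Delta x}{\lambda z}x\right)\mathrm{sinc}\left(\frac{\pi\Delta y}{\lambda z}y\right) \tag{11.23}$$

The only part of (11.22) that remains to be evaluated is the summation out in
front. Let the apertures be positioned at

$$x'_n = \left(n - \frac{N+1}{2}\right)h, \qquad y'_n = 0 \tag{11.24}$$

where N is the total number of slits. Then the summation in the array theorem,
(11.22), becomes

$$\sum_{n=1}^{N}e^{-i\frac{k}{z}(xx'_n+yy'_n)} = e^{i\frac{khx}{2z}(N+1)}\sum_{n=1}^{N}e^{-i\frac{khx}{z}n} \tag{11.25}$$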

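This summation is recognized as a geometric sum, which can be performed using
formula (0.65).

Equation (11.25) then simplifies to

$$\sum_{n=1}^{N}e^{-i\frac{k}{z}(xx'_n+yy'_n)} = e^{i\frac{khx}{2z}(N+1)}\,e^{-i\frac{khx}{z}}\,\frac{e^{-i\frac{khx}{z}N}-1}{e^{-i\frac{khx}{z}}-1} = \frac{e^{-i\frac{khx}{2z}N}-e^{i\frac{khx}{2z}N}}{e^{-i\frac{khx}{2z}}-e^{i\frac{khx}{2z}}} = \frac{\sin\left(N\frac{khx}{2z}\right)}{\sin\left(\frac{khx}{2z}\right)} \tag{11.26}$$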

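By combining (11.23) and (11.26) we obtain the full Fraunhofer diffraction pattern
for a diffraction grating. The expression for the field is

$$E(x,y,z) = \frac{\sin\left(N\frac{khx}{2z}\right)}{\sin\left(\frac{khx}{2z}\right)}\left[-iE_0\frac{\Delta x\,\Delta y\,e^{ikz}}{\lambda z}\,e^{i\frac{k}{2z}(x^2+y^2)}\,\mathrm{sinc}\left(\frac{\pi\Delta x}{\lambda z}x\right)\mathrm{sinc}\left(\frac{\pi\Delta y}{\lambda z}y\right)\right] \tag{11.27}$$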

Now let us suppose that the slits are really tall (parallel to the y-dimension)
such that $\Delta y \gg \lambda$. If the slits are infinitely tall, the final sinc function in Eq. (11.27)
can be approximated as one.³ The intensity pattern in the horizontal direction
can then be written in terms of the peak intensity of the diffraction pattern on the
screen:

$$I(x) = I_{\rm peak}\,\mathrm{sinc}^2\left(\frac{\pi\Delta x}{\lambda z}x\right)\frac{\sin^2\left(N\frac{\pi hx}{\lambda z}\right)}{N^2\sin^2\left(\frac{\pi hx}{\lambda z}\right)} \tag{11.28}$$

Note that $\lim_{\alpha\to0}\frac{\sin N\alpha}{\sin\alpha} = N$, so we have placed $N^2$ in the denominator when
introducing our definition of $I_{\rm peak}$, which represents the intensity on the screen at
x = 0. In principle, the intensity $I_{\rm peak}$ is a function of y and depends on the exact
details of how the slits are illuminated as a function of y, but this is usually not of
interest as long as we stay with a given value of y as we scan along x.

³This is mostly the right idea, but is still a bit of a fake. In fact, the field often does not have a
uniform phase along the entire slit in the y-dimension, so our use of the function $\mathrm{sinc}[(\pi\Delta y/\lambda z)\,y]$
was inappropriate to begin with. The energy in a real spectrometer is usually spread out in a diffuse
pattern in the y-dimension. However, its form in y is of little relevance; the spectral information is
carried in the x-dimension only.

It is left as an exercise to study the functional form of (11.28), especially how 
the number of slits N influences the behavior. The case of N = 2 describes 
the diffraction pattern for a Young's double slit experiment. We now have a 
description of the Young's two-slit pattern in the case that the slits have finite 
openings of width Ax rather than infinitely narrow ones. 
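As a starting point for exploring (11.28), the intensity can be coded directly. This is a sketch under stated assumptions: sinc(u) means sin(u)/u as used in this chapter, the zero-over-zero points are handled by substituting the limiting value, and the function names, the tolerance `1e-12`, and the sample parameters are hypothetical choices.

```python
import math

def sinc(u):
    """sin(u)/u with the limiting value 1 at u = 0."""
    return 1.0 if u == 0 else math.sin(u) / u

def grating_intensity(x, N, h, slit_width, wavelength, z, I_peak=1.0):
    """Intensity (11.28) for N slits of width slit_width spaced by h."""
    alpha = math.pi * h * x / (wavelength * z)
    beta = math.pi * slit_width * x / (wavelength * z)
    s = math.sin(alpha)
    if abs(s) < 1e-12:
        ratio = 1.0  # limit of sin^2(N alpha) / (N^2 sin^2 alpha) at alpha = m pi
    else:
        ratio = (math.sin(N * alpha) / (N * s)) ** 2
    return I_peak * sinc(beta) ** 2 * ratio

# Illustrative numbers: 500 nm light, 10 um spacing, 5 um slits, screen at 1 m.
wavelength, h, width, z = 500e-9, 10e-6, 5e-6, 1.0
for N in (2, 5, 100):
    print(N, grating_intensity(0.01, N, h, width, wavelength, z))
```

For N = 2 the ratio of sines reduces to a squared cosine, which recovers the Young's double-slit pattern multiplied by the single-slit envelope.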

A final note: You may wonder why we are interested in Fraunhofer diffraction 
from a grating. The reason is that we are actually interested in separating different 
wavelengths by observing their distinct diffraction patterns separated in space. In 
order to achieve good spatial separation between light of different wavelengths, 
it is necessary to allow the light to propagate a far distance. Optimal separation 
(the maximum possible) occurs therefore in the Fraunhofer regime. 

11.5 Spectrometers 

The formula (11.28) can be exploited to make wavelength measurements. This 
forms the basis of a diffraction grating spectrometer. A spectrometer has relatively 
poor resolving power compared to a Fabry-Perot interferometer. Nevertheless, a 
spectrometer is not hampered by the serious limitation imposed by free spectral 
range. A spectrometer is able to measure a wide range of wavelengths simulta- 
neously. The Fabry-Perot interferometer and the grating spectrometer in this 
sense are complementary, the one being able to make very precise measurements 
within a narrow wavelength range and the other being able to characterize wide 
ranges of wavelengths simultaneously. 

To appreciate how a spectrometer works, consider Fraunhofer diffraction 
from a grating, as described by (11.28). The structure of the diffraction pattern 
has various peaks. For example, Fig. 11.12a shows the diffraction peaks from a 
Young's double slit (i.e. N = 2). The diffraction pattern is comprised of the typical 
Young's double-slit pattern multiplied by the diffraction pattern of a single slit. 
(Note that $\sin^2(2\alpha)/[4\sin^2(\alpha)] = \cos^2(\alpha)$.)

As the number of slits N is increased, the peaks seen in the Young's double-slit
pattern tend to sharpen with additional smaller peaks appearing in between.
Figure 11.12b shows the case for N = 5. The more significant peaks occur when
$\sin(\pi hx/\lambda z)$ in the denominator of (11.28) goes to zero. Keep in mind that the
numerator goes to zero at the same places, creating a zero-over-zero situation, so
the peaks are not infinitely tall.

Figure 11.12 Diffraction through various numbers of slits, each with $\Delta x = h/2$
(slit widths half the separation). The dotted line shows the single-slit diffraction
pattern. (a) Diffraction from a double slit. (b) Diffraction from 5 slits.
(c) Diffraction from 10 slits. (d) Diffraction from 100 slits.

Figure 11.13 Animation showing diffraction through a number of slits.

With larger values of N, the peaks can become extremely sharp, and the small
secondary peaks in between are smaller in comparison. Fig. 11.12c shows the
case of N = 10 and Fig. 11.12d shows the case of N = 100.

When very many slits are used, the diffraction pattern becomes very useful for 
measuring spectra of light, since the position of the diffraction peaks depends on 
wavelength (except for the center peak at x = 0). If light of different wavelengths 
is simultaneously present, then the diffraction peaks associated with different 
wavelengths appear in different locations. It helps to have very many slits involved 
(i.e. large N) so that the diffraction peaks are sharply defined. Then closely spaced
wavelengths can be more easily distinguished. 

Consider the inset in Fig. 11.12d, which gives a close-up view of the first-order 
diffraction peak for N = 100. The location of this peak on a distant screen varies 
with the wavelength of the light. How much must the wavelength change to cause 
the peak to move by half of its 'width' as marked in the inset of Fig. 11.12d? We
will say that this is the minimum separation of wavelengths that still allows the 
two peaks to be distinguished. 

Finding the Minimum Distinguishable Wavelength Separation 

As mentioned, the main diffraction peaks occur when the denominator of (11.28)
goes to zero, i.e.

$$\frac{\pi hx}{\lambda z} = m\pi \tag{11.29}$$

The numerator of (11.28) goes to zero at these same locations (i.e. $N\pi hx/\lambda z = Nm\pi$),
so the peaks remain finite. If two nearby wavelengths $\lambda_1$ and $\lambda_2$ are sent
through the grating simultaneously, their mth peaks are located at

$$x_1 = \frac{mz\lambda_1}{h} \quad\text{and}\quad x_2 = \frac{mz\lambda_2}{h} \tag{11.30}$$

These are spatially separated by

$$\Delta x_\lambda \equiv x_2 - x_1 = \frac{mz}{h}\Delta\lambda \tag{11.31}$$

where $\Delta\lambda \equiv \lambda_2 - \lambda_1$.

Meanwhile, we can find the spatial width of, say, the first peak by considering the
change in $x_1$ that causes the sine in the numerator of (11.28) to reach the nearby
zero (see inset in Fig. 11.12d). This condition implies

$$N\frac{\pi h\left(x_1+\Delta x_{\rm peak}\right)}{\lambda_1 z} = Nm\pi + \pi \tag{11.32}$$

We will say that two peaks, associated with $\lambda_1$ and $\lambda_2$, are barely distinguishable
when $\Delta x_\lambda = \Delta x_{\rm peak}$. We also substitute from (11.30) to rewrite (11.32) as

$$N\frac{\pi h\left(mz\lambda_1/h + mz\Delta\lambda/h\right)}{\lambda_1 z} = Nm\pi + \pi \quad\Rightarrow\quad \Delta\lambda = \frac{\lambda}{Nm} \tag{11.33}$$

Here we have dropped the subscript on the wavelength in the spirit of $\lambda_1 \approx \lambda_2 \approx \lambda$.
As we did for the Fabry-Perot interferometer, we can define the resolving
power of the diffraction grating as

$$RP \equiv \frac{\lambda}{\Delta\lambda} = mN \tag{11.34}$$

The resolving power is proportional to the number of slits illuminated on the
diffraction grating. The resolving power also improves for higher diffraction
orders m.

Example 11.3

What is the resolving power with m = 1 of a 2-cm-wide grating with 500 slits per
millimeter, and how wide is the 1st-order diffraction peak for 500-nm light after
1-m focusing?

Solution: From (11.34) the resolving power is

$$RP = mN = (1)\,(2\ \mathrm{cm})\,\frac{500}{0.1\ \mathrm{cm}} = 10^4$$

and the minimum distinguishable wavelength separation is

$$\Delta\lambda = \lambda/RP = 500\ \mathrm{nm}/10^4 = 0.05\ \mathrm{nm}$$

From (11.31), with $z \to f$, we have

$$\Delta x = \frac{mf}{h}\Delta\lambda = \frac{(1)(1\ \mathrm{m})}{2\times10^{-6}\ \mathrm{m}}\,0.05\ \mathrm{nm} = 25\ \mu\mathrm{m}$$
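The numbers in Example 11.3 follow directly from (11.34) and (11.31). The short sketch below simply repeats that arithmetic in SI units; the variable names are illustrative.

```python
order = 1                     # diffraction order m
grating_width = 0.02          # m (2 cm)
slits_per_metre = 500e3       # 500 slits per millimetre
wavelength = 500e-9           # m
focal_length = 1.0            # m, plays the role of z in (11.31)

n_slits = grating_width * slits_per_metre        # N
slit_spacing = 1.0 / slits_per_metre             # h
resolving_power = order * n_slits                # (11.34)
delta_lambda = wavelength / resolving_power      # minimum resolvable separation
peak_width = order * focal_length / slit_spacing * delta_lambda  # (11.31)

print(resolving_power)   # about 1e4
print(delta_lambda)      # about 5e-11 m, i.e. 0.05 nm
print(peak_width)        # about 2.5e-5 m, i.e. 25 micrometres
```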



11.6 Diffraction of a Gaussian Field Profile 

Consider a Gaussian field profile described, at the plane z = 0, with the functional
form

$$E(x',y',0) = E_0\,e^{-\frac{x'^2+y'^2}{w_0^2}} \tag{11.35}$$

where $w_0$, called the beam waist, specifies the radius of the Gaussian profile. It is
depicted in Fig. 11.14. To better appreciate the meaning of $w_0$, consider the
intensity of the above field distribution:

$$I(x',y',0) = I_0\,e^{-2\rho'^2/w_0^2} \tag{11.36}$$

where $\rho'^2 \equiv x'^2 + y'^2$. In (11.36) we see that $w_0$ indicates the radius at which the
intensity reduces by the factor $e^{-2} = 0.135$.

Figure 11.14 Diffraction of a Gaussian field profile.

We would like to know how this field evolves when it propagates forward from
the plane z = 0. We compute the field downstream using the Fresnel approximation
(10.13):
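$$E(x,y,z) = -i\frac{e^{ikz}\,e^{i\frac{k}{2z}(x^2+y^2)}}{\lambda z}\int_{-\infty}^{\infty}dx'\int_{-\infty}^{\infty}dy'\left[E_0\,e^{-(x'^2+y'^2)/w_0^2}\right]e^{i\frac{k}{2z}(x'^2+y'^2)}\,e^{-i\frac{k}{z}(xx'+yy')} \tag{11.37}$$

The Gaussian profile itself limits the dimension of the emission region, so there is
no problem in integrating to infinity. Equation (11.37) can be rewritten as

$$E(x,y,z) = -i\frac{E_0\,e^{ikz}\,e^{i\frac{k}{2z}(x^2+y^2)}}{\lambda z}\int_{-\infty}^{\infty}dx'\,e^{-\left(\frac{1}{w_0^2}-i\frac{k}{2z}\right)x'^2-i\frac{kx}{z}x'}\int_{-\infty}^{\infty}dy'\,e^{-\left(\frac{1}{w_0^2}-i\frac{k}{2z}\right)y'^2-i\frac{ky}{z}y'} \tag{11.38}$$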



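The integrals over x' and y' have the identical form and can be done individually
with the help of the integral formula (0.55). The algebra is cumbersome, but the
integral in the x' dimension becomes

$$\int_{-\infty}^{\infty}dx'\,e^{-\left(\frac{1}{w_0^2}-i\frac{k}{2z}\right)x'^2-i\frac{kx}{z}x'} = \left[\frac{\pi}{\frac{1}{w_0^2}-i\frac{k}{2z}}\right]^{1/2}\exp\left[\frac{\left(-i\frac{kx}{z}\right)^2}{4\left(\frac{1}{w_0^2}-i\frac{k}{2z}\right)}\right] = \left[\frac{i\lambda z\,e^{-i\tan^{-1}\frac{2z}{kw_0^2}}}{\sqrt{1+\left(\frac{2z}{kw_0^2}\right)^2}}\right]^{1/2}\exp\left[-\frac{kx^2}{2z}\,\frac{\frac{2z}{kw_0^2}+i}{1+\left(\frac{2z}{kw_0^2}\right)^2}\right] \tag{11.39}$$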

A similar expression results from the integration on y'. 

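When (11.39) and the equivalent expression for the y-dimension are used in
(11.38), the result is

$$E(x,y,z) = E_0\,e^{ikz}\,e^{i\frac{k}{2z}(x^2+y^2)}\exp\left[-\frac{k(x^2+y^2)}{2z}\,\frac{\frac{2z}{kw_0^2}+i}{1+\left(\frac{2z}{kw_0^2}\right)^2}\right]\frac{e^{-i\tan^{-1}\frac{2z}{kw_0^2}}}{\sqrt{1+\left(\frac{2z}{kw_0^2}\right)^2}} \tag{11.40}$$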

This rather complicated-looking expression for the field distribution is in fact very 
useful and can be directly interpreted, as discussed in the next section. 

Gaussian Field in Cylindrical Coordinates 



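A Gaussian field profile is one of few diffraction problems that can be handled
conveniently in either Cartesian (as above) or cylindrical coordinates. In cylindrical
coordinates, the Fresnel diffraction integral (10.27) is

$$E(\rho,z) = -\frac{2\pi i\,e^{ikz}\,e^{i\frac{k\rho^2}{2z}}}{\lambda z}\int_0^{\infty}\rho'\,d\rho'\,E_0\,e^{-\rho'^2/w_0^2}\,e^{i\frac{k\rho'^2}{2z}}\,J_0\left(\frac{k\rho\rho'}{z}\right)$$

We can use the integral formula (0.59) to obtain

$$E(\rho,z) = -iE_0\frac{2\pi\,e^{ikz}\,e^{i\frac{k\rho^2}{2z}}}{\lambda z}\,\frac{\exp\left[-\frac{(k\rho/z)^2}{4\left(\frac{1}{w_0^2}-i\frac{k}{2z}\right)}\right]}{2\left(\frac{1}{w_0^2}-i\frac{k}{2z}\right)} = E_0\,e^{ikz}\,e^{i\frac{k\rho^2}{2z}}\exp\left[-\frac{k\rho^2}{2z}\,\frac{\frac{2z}{kw_0^2}+i}{1+\left(\frac{2z}{kw_0^2}\right)^2}\right]\frac{e^{-i\tan^{-1}\frac{2z}{kw_0^2}}}{\sqrt{1+\left(\frac{2z}{kw_0^2}\right)^2}}$$

which is identical to (11.40).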



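11.7 Gaussian Laser Beams

The cumbersome Gaussian-field expression (11.40) can be cleaned up through
the judicious introduction of new quantities:

$$E(\rho,z) = E_0\frac{w_0}{w(z)}\,e^{-\frac{\rho^2}{w^2(z)}}\,e^{ikz+i\frac{k\rho^2}{2R(z)}-i\tan^{-1}\frac{z}{z_0}} \tag{11.41}$$

where

$$\rho^2 \equiv x^2+y^2, \tag{11.42}$$

$$w(z) \equiv w_0\sqrt{1+z^2/z_0^2}, \tag{11.43}$$

$$R(z) \equiv z+z_0^2/z, \tag{11.44}$$

$$z_0 \equiv \frac{kw_0^2}{2}. \tag{11.45}$$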
This formula describes the lowest-order Gaussian mode, the most common laser 
beam profile. (Please be aware that some lasers are multimode and exhibit more 
complicated structures.) 
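The equivalence of the compact form (11.41) and the Fresnel result can be spot-checked numerically. The sketch below is illustrative rather than definitive: `field_compact` codes (11.41) with the definitions (11.43) through (11.45), while `field_fresnel` codes the Fresnel result in one algebraically equivalent arrangement, with u = 2z/(k w0²). The function names and the sample beam parameters (a 1 mm waist at 633 nm) are assumptions made for the example.

```python
import cmath
import math

def rayleigh_range(w0, wavelength):
    """z0 = k w0^2 / 2 = pi w0^2 / wavelength, from (11.45)."""
    return math.pi * w0 ** 2 / wavelength

def beam_radius(z, w0, z0):
    """w(z) from (11.43)."""
    return w0 * math.sqrt(1 + (z / z0) ** 2)

def field_compact(rho, z, w0, wavelength, E0=1.0):
    """Gaussian beam field in the form (11.41)."""
    k = 2 * math.pi / wavelength
    z0 = rayleigh_range(w0, wavelength)
    w = beam_radius(z, w0, z0)
    inv_R = z / (z ** 2 + z0 ** 2)   # 1/R(z); well defined even at z = 0
    phase = k * z + k * rho ** 2 * inv_R / 2 - math.atan(z / z0)
    return E0 * (w0 / w) * math.exp(-rho ** 2 / w ** 2) * cmath.exp(1j * phase)

def field_fresnel(rho, z, w0, wavelength, E0=1.0):
    """Fresnel-integral result for the Gaussian profile (requires z != 0)."""
    k = 2 * math.pi / wavelength
    u = 2 * z / (k * w0 ** 2)
    gauss = cmath.exp(-(k * rho ** 2 / (2 * z)) * (u + 1j) / (1 + u ** 2))
    prefactor = cmath.exp(1j * k * z) * cmath.exp(1j * k * rho ** 2 / (2 * z))
    return E0 * prefactor * gauss * cmath.exp(-1j * math.atan(u)) / math.sqrt(1 + u ** 2)

w0, wavelength = 1e-3, 633e-9
for rho, z in [(0.0, 2.0), (0.5e-3, 2.0), (1.5e-3, -7.0)]:
    print(abs(field_compact(rho, z, w0, wavelength) - field_fresnel(rho, z, w0, wavelength)))
```

The printed differences should be negligible compared with the field amplitude, including for the negative value of z.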

It turns out that (11.41) works equally well for negative values of z. The
expression can therefore be used to describe the field of a simple laser beam
everywhere (before and after it goes through a focus). In fact, the expression
works also near z = 0!⁴ At z = 0 the diffracted field (11.41) returns the exact
expression for the original field profile (11.35) (see P11.11). In short, (11.41) may
be used with impunity as long as the divergence angle of the beam is not too wide.

⁴There is good reason for this since the Fresnel diffraction integral is an exact solution to the
paraxial wave equation (10.15). The beam (11.41) therefore satisfies the paraxial wave equation for
all z.

To begin our interpretation of (11.41), consider the intensity profile $I \propto E^*E$
as depicted in Fig. 11.15:

$$I(\rho,z) = I_0\frac{w_0^2}{w^2(z)}\,e^{-\frac{2\rho^2}{w^2(z)}} = \frac{I_0}{1+z^2/z_0^2}\,e^{-\frac{2\rho^2}{w^2(z)}} \tag{11.46}$$

Figure 11.15 A Gaussian laser field profile in the vicinity of its beam waist.



By inspection, we see that w(z) gives the radius of the beam anywhere along
z. At z = 0, the beam waist, w(z = 0) reduces to $w_0$, as expected. The parameter
$z_0$, known as the Rayleigh range, specifies the distance along the axis from z = 0
to the point where the intensity decreases by a factor of 2. Note that $w_0$ and $z_0$
are not independent of each other but are connected through the wavelength
according to (11.45). There is a tradeoff: a small beam waist means a short depth
of focus. That is, a small $w_0$ means a small Rayleigh range $z_0$.

We next consider the phase terms that appear in the field expression (11.41).
The phase term $ikz + ik\rho^2/2R(z)$ describes the phase of curved wave fronts,
where R(z) is the radius of curvature of the wave front at z. At z = 0, the radius of
curvature is infinite (see (11.44)), meaning that the wave front is flat at the laser
beam waist. In contrast, at very large values of z we have $R(z) \cong z$ (see (11.44)).
In this case, we may write these phase terms as $kz + k\rho^2/2z \cong k\sqrt{z^2+\rho^2}$. This
describes a spherical wave front emanating from the origin out to point $(\rho, z)$. The
Fresnel approximation (same as the paraxial approximation) represents spherical
wave fronts with the former parabolic approximation. As a reminder, to restore the
temporal dependence of the field, we append $e^{-i\omega t}$ to the solution, as discussed
in connection with (10.4).

The phase $-i\tan^{-1}(z/z_0)$ is perhaps a bit more mysterious. It is called the Gouy
shift and is actually present for any light that goes through a focus, not just laser
beams. The Gouy shift is not overly dramatic since the expression $\tan^{-1}(z/z_0)$
ranges from $-\pi/2$ (at $z = -\infty$) to $\pi/2$ (at $z = +\infty$). Nevertheless, when light goes
through a focus, it experiences an overall phase shift of $\pi$.



Example 11.4

Write the beam waist $w_0$ in terms of the f-number, defined to be the ratio of z to
the beam diameter 2w(z) far from the beam waist.

Solution: Far away from the beam waist (i.e. $z \gg z_0$) the laser beam expands
along a cone. That is, its diameter increases in proportion to distance.

$$w(z) = w_0\sqrt{1+z^2/z_0^2} \;\to\; w_0\,z/z_0$$

The cone angle is parameterized by the f-number, the ratio of the cone height to
its base:

$$f\# \equiv \lim_{z\to\pm\infty}\frac{z}{2w(z)} = \frac{z}{2w_0\,z/z_0} = \frac{z_0}{2w_0}$$

Figure 11.16

Substitution of (11.45) into this expression yields

$$w_0 = \frac{2\lambda f\#}{\pi} \tag{11.47}$$

Equation (11.47) gives a convenient way to predict the size of a laser focus. One
calculates the f-number by dividing the distance to the focus by the diameter of
the beam at the lens. However, in practice you may be very surprised at how badly
a beam focuses compared to the theoretical prediction (due to aberrations, etc.).
It is always good practice to directly measure your focus if its size is important to
an experiment.
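A quick estimate based on (11.47) might look like the following sketch. The beam and lens values are invented for illustration (a beam of diameter 2w = 1 cm at a lens, focused 10 cm away, at 800 nm), and the function names are hypothetical. As the example cautions, this is the ideal diffraction-limited prediction only.

```python
import math

def f_number(distance_to_focus, beam_diameter):
    """Ratio of cone height to base; beam_diameter means 2 w(z) at the lens."""
    return distance_to_focus / beam_diameter

def waist_from_f_number(wavelength, f_num):
    """Beam waist w0 from (11.47)."""
    return 2 * wavelength * f_num / math.pi

f_num = f_number(0.10, 0.01)               # f/10
w0 = waist_from_f_number(800e-9, f_num)
print(f_num, w0)                            # w0 is roughly 5 micrometres

# Consistency check against (11.45): z0 / (2 w0) should reproduce the f-number.
z0 = math.pi * w0 ** 2 / 800e-9
print(z0 / (2 * w0))
```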



Appendix 11.A ABCD Law for Gaussian Beams

In this section we discuss and justify the ABCD law for Gaussian beams. The 
law enables one to predict the parameters of a Gaussian beam that exits from an 
optical system, given the parameters of an input Gaussian beam. To make the 
prediction, one needs only the ABCD matrix for the optical system, taken as a 
whole. The system may be arbitrarily complex with many optical components. 

At first, it may seem unlikely that such a prediction should be possible since 
ABCD matrices were introduced to describe the propagation of rays. On the other 
hand, Gaussian beams are governed by the laws of diffraction. As an example of 
this dichotomy, consider a collimated Gaussian beam that traverses a converging 
lens. By ray theory, one expects the Gaussian beam to focus near the focal point 
of the lens. However, a collimated beam by definition is already in the act of going 
through focus. In the absence of the lens, there is a tendency for the beam to grow 
via diffraction, especially if the beam waist is small. This tendency competes with 
the focusing effect of the lens, and a new beam waist can occur at a wide range of 
locations, depending on the exact outcome of this competition. 

A Gaussian beam is characterized by its Rayleigh range $z_0$. From this, the
beam waist radius $w_0$ may be extracted via (11.45), assuming the wavelength is
known. Suppose that a Gaussian beam encounters an optical system at position
z, referenced to the position of the beam's waist as shown in Fig. 11.17. The beam
exiting from the system, in general, has a new Rayleigh range $z'_0$. The waist of the
new beam also occurs at a different location. Let z' denote the location of the exit
of the optical system, referenced to the location of the waist of the new beam. If
the exiting beam diverges as in Fig. 11.17, then it emerges from a virtual beam
waist located before the exit point of the system. In this case, z' is taken to be
positive. On the other hand, if the emerging beam converges to an actual waist,
then z' is taken to be negative since the exit point of the system occurs before the
focus.

The ABCD law is embodied in the following relationship:

$$z' + iz'_0 = \frac{A\,(z+iz_0)+B}{C\,(z+iz_0)+D} \tag{11.48}$$
where A, B, C, and D are the matrix elements of the optical system. The imaginary
number $i \equiv \sqrt{-1}$ imbues the law with complex arithmetic. It makes two equations
from one, since the real and imaginary parts of (11.48) must separately be equal.

Figure 11.17 Gaussian laser beam traversing an optical system described by an ABCD
matrix. The dark lines represent the incoming and exiting beams. The gray line
represents where the exiting beam appears to have been.

We now prove the ABCD law. We begin by showing that the law holds for
two specific ABCD matrices. First, consider the matrix for propagation through a
distance d:

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & d \\ 0 & 1 \end{bmatrix} \tag{11.49}$$

We know that simple propagation has minimal effect on a beam. The Rayleigh
range is unchanged, so we expect that the ABCD law should give $z'_0 = z_0$. The
propagation through a distance d modifies the beam position by z' = z + d. We
now check that the ABCD law agrees with these results by inserting (11.49) into
(11.48):

$$z' + iz'_0 = \frac{1\,(z+iz_0)+d}{0\,(z+iz_0)+1} = z+d+iz_0 \quad\text{(propagation through distance } d\text{)} \tag{11.50}$$

Thus, the law holds in this case.

Next we consider the ABCD matrix of a thin lens (or a curved mirror):

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix} \tag{11.51}$$

A beam that traverses a thin lens undergoes the phase shift $-k\rho^2/2f$, according
to (11.11). This modifies the original phase of the wave front $k\rho^2/2R(z)$, seen in
(11.41). The phase of the exiting beam is therefore

$$\frac{k\rho^2}{2R(z')} = \frac{k\rho^2}{2R(z)} - \frac{k\rho^2}{2f} \tag{11.52}$$

where we do not keep track of unimportant overall phases such as kz or kz'. With
(11.44) this relationship reduces to

$$\frac{1}{R(z')} = \frac{1}{R(z)} - \frac{1}{f} \quad\Rightarrow\quad \frac{1}{z'+z'^2_0/z'} = \frac{1}{z+z_0^2/z} - \frac{1}{f} \tag{11.53}$$
In addition to this relationship, the local radius of the beam given by (11.43)
cannot change while traversing the 'thin' lens. Therefore,

$$w(z') = w(z) \quad\Rightarrow\quad z'_0\left(1+\frac{z'^2}{z'^2_0}\right) = z_0\left(1+\frac{z^2}{z_0^2}\right) \tag{11.54}$$

On the other hand, the ABCD law for the thin lens gives

$$z' + iz'_0 = \frac{1\,(z+iz_0)+0}{-(1/f)\,(z+iz_0)+1} \quad\text{(traversing a thin lens with focal length } f\text{)} \tag{11.55}$$

It is left as an exercise (see P11.14) to show that (11.55) is consistent with (11.53)
and (11.54).

So far we have shown that the ABCD law works for two specific examples,
namely propagation through a distance d and transmission through a thin lens
with focal length f. From these elements we can derive more complicated systems.
The ABCD matrix for a thick lens, however, cannot be constructed from just
these two elements. We can construct it if we
sandwich a thick window (as opposed to empty space) between two thin lenses.
The proof that the matrix for a thick window obeys the ABCD law is left as an
exercise (see P11.17). With these relatively few elements, essentially any optical
system can be constructed, provided that the beam propagation begins and ends
in the same index of refraction.

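To complete our proof of the general ABCD law, we need only show that when
it is applied to the compound element

$$
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
= \begin{bmatrix} A_2 & B_2 \\ C_2 & D_2 \end{bmatrix}
  \begin{bmatrix} A_1 & B_1 \\ C_1 & D_1 \end{bmatrix}
= \begin{bmatrix} A_2 A_1 + B_2 C_1 & A_2 B_1 + B_2 D_1 \\ C_2 A_1 + D_2 C_1 & C_2 B_1 + D_2 D_1 \end{bmatrix} \tag{11.56}
$$

it gives the same answer as when the law is applied sequentially, first on
$\begin{bmatrix} A_1 & B_1 \\ C_1 & D_1 \end{bmatrix}$
and then on
$\begin{bmatrix} A_2 & B_2 \\ C_2 & D_2 \end{bmatrix}$.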
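Explicitly, we have

$$
\begin{aligned}
z'' + i z_0'' &= \frac{A_2 (z' + i z_0') + B_2}{C_2 (z' + i z_0') + D_2} \\
&= \frac{A_2 \left[ \dfrac{A_1 (z + i z_0) + B_1}{C_1 (z + i z_0) + D_1} \right] + B_2}
        {C_2 \left[ \dfrac{A_1 (z + i z_0) + B_1}{C_1 (z + i z_0) + D_1} \right] + D_2} \\
&= \frac{A_2 \left[ A_1 (z + i z_0) + B_1 \right] + B_2 \left[ C_1 (z + i z_0) + D_1 \right]}
        {C_2 \left[ A_1 (z + i z_0) + B_1 \right] + D_2 \left[ C_1 (z + i z_0) + D_1 \right]} \\
&= \frac{(A_2 A_1 + B_2 C_1)(z + i z_0) + (A_2 B_1 + B_2 D_1)}
        {(C_2 A_1 + D_2 C_1)(z + i z_0) + (C_2 B_1 + D_2 D_1)} \\
&= \frac{A (z + i z_0) + B}{C (z + i z_0) + D}
\end{aligned} \tag{11.57}
$$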

Thus, we can construct any ABCD matrix that we wish from matrices that are 
known to obey the ABCD law. The resulting matrix also obeys the ABCD law. 
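The argument above can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the function names (`abcd_law`, `matmul`, `propagation`, `thin_lens`, `radius_of_curvature`) and the sample values of z, z0, d, and f are assumptions chosen for illustration. It applies the ABCD law to the propagation and thin-lens matrices, checks the lens result against (11.53) and (11.54), and compares sequential application with the compound matrix of (11.56).

```python
import math

def abcd_law(M, q):
    """Return q' = (A q + B) / (C q + D), where q = z + i*z0."""
    (A, B), (C, D) = M
    return (A * q + B) / (C * q + D)

def matmul(M2, M1):
    """2x2 product M2 M1 (element 1 is traversed first), as in (11.56)."""
    (a2, b2), (c2, d2) = M2
    (a1, b1), (c1, d1) = M1
    return ((a2 * a1 + b2 * c1, a2 * b1 + b2 * d1),
            (c2 * a1 + d2 * c1, c2 * b1 + d2 * d1))

def propagation(d):
    """ABCD matrix for propagation through a distance d, see (11.49)."""
    return ((1.0, d), (0.0, 1.0))

def thin_lens(f):
    """ABCD matrix for a thin lens of focal length f, see (11.51)."""
    return ((1.0, 0.0), (-1.0 / f, 1.0))

def radius_of_curvature(z, z0):
    """R(z) = z + z0^2 / z (requires z != 0)."""
    return z + z0 ** 2 / z

# Illustrative values in arbitrary length units (assumed, not from the text).
z, z0, d, f = 0.10, 0.25, 0.30, 0.50
q = complex(z, z0)

# Propagation: expect z' = z + d and z0' = z0, as in (11.50).
q_prop = abcd_law(propagation(d), q)

# Thin lens: compare with (11.53) and (11.54).
q_lens = abcd_law(thin_lens(f), q)
zp, z0p = q_lens.real, q_lens.imag
lens_curvature_ok = math.isclose(
    1 / radius_of_curvature(zp, z0p), 1 / radius_of_curvature(z, z0) - 1 / f)
lens_width_ok = math.isclose(
    z0p * (1 + zp ** 2 / z0p ** 2), z0 * (1 + z ** 2 / z0 ** 2))

# Composition: lens first, then propagation, versus the compound matrix.
q_sequential = abcd_law(propagation(d), abcd_law(thin_lens(f), q))
q_compound = abcd_law(matmul(propagation(d), thin_lens(f)), q)

print(q_prop, lens_curvature_ok, lens_width_ok, abs(q_sequential - q_compound))
```

A numerical check of this kind does not replace the proof; it only confirms that the reconstructed formulas agree with one another for one set of sample values.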
Exercises for 11.1 Fraunhofer Diffraction Through a Lens

P11.1 Fill in the steps leading to (11.14) from (11.13). Show that the intensity
distribution (11.6) is consistent with (11.14).

L11.2 Set up a collimated 'plane wave' in the laboratory using a HeNe laser
($\lambda = 633$ nm) and appropriate lenses.

(a) Choose a rectangular aperture ($\Delta x$ by $\Delta y$) and place it in the plane
wave. Observe the Fraunhofer diffraction on a very far away screen (i.e.,
where $z \gg \frac{k}{2}(\text{aperture radius})^2$ is satisfied). Check that the location of
the 'zeros' agrees with (10.20).

(b) Place a lens in the beam after the aperture. Use a CCD camera
to observe the Fraunhofer diffraction profile at the focus of the lens.
Check that the location of the 'zeros' agrees with (10.20), replacing z
with f.

(c) Repeat parts (a) and (b) using a circular aperture with diameter $\ell$.
Check the position of the first 'zero'. (video)




Figure 11.18



Exercises for 11.2 Resolution of a Telescope 

P11.3 On the night of April 18, 1775, a signal was sent from the Old North
Church steeple to Paul Revere, who was 1.8 miles away: "One if by
land, two if by sea." If in the dark, Paul's pupils had 4 mm diameters,
what is the minimum possible separation between the two lanterns
that would allow him to correctly interpret the signal? Assume that the
predominant wavelength of the lanterns was 580 nm.

HINT: In the eye, the index of refraction is about 1.33 so the wavelength
is shorter. This leads to a smaller diffraction pattern on the retina.
However, in accordance with Snell's law, two rays separated by an angle
$\theta$ outside of the eye are separated by an angle $\theta/1.33$ inside the
eye. The two rays then hit on the retina closer together. As far as
resolution is concerned, the two effects exactly compensate.
L11.4 Simulate two stars with laser beams ($\lambda = 633$ nm). Align them nearly
parallel with a small lateral displacement. Send the beams down a long 
corridor until diffraction causes both beams to grow into one another 
so that it is no longer apparent that they are from two distinct sources. 
Use a lens to image the two sources onto a CCD camera. The camera 
should be placed close to the focal plane of the lens. Use a variable iris 
near the lens to create different pupil openings. 



Figure 11.19



Experimentally determine the pupil diameter that just allows you to 
resolve the two sources according to the Rayleigh criterion. Check your 
measurement against theoretical prediction. (video)

HINT: The angular separation between the two sources is obtained by 
dividing propagation distance into the lateral separation of the beams. 




Exercises for 11.3 The Array Theorem

P11.5 Find the diffraction pattern created by an array of nine circles, each
with radius a, which are centered at the following $(x', y')$ coordinates:
$(-b, b)$, $(0, b)$, $(b, b)$, $(-b, 0)$, $(0, 0)$, $(b, 0)$, $(-b, -b)$, $(0, -b)$, $(b, -b)$ (a is
less than b). Make a plot of the result for the situation where (in some
choice of units) $a = 1$, $b = 5a$, and $k/d = 1$. View the plot at different
"zoom levels" to see the finer detail.



Figure 11.20

P11.6 (a) A plane wave is incident on a screen of $N^2$ uniformly spaced identical
rectangular apertures of dimension $\Delta x$ by $\Delta y$ (see Fig. 11.20). Their
positions are described by $x_n = h\left(n - \frac{N+1}{2}\right)$ and $y_m = s\left(m - \frac{N+1}{2}\right)$. Find
the far-field (Fraunhofer) pattern of the light transmitted by the grid.

(b) You look at a distant sodium street lamp (somewhat monochro- 
matic) through a curtain made from a fine mesh fabric with crossed 
threads. Make a sketch of what you expect to see (how the lamp will 
look to you). 

HINT: Remember that the lens of your eye causes the Fraunhofer 
diffraction of the mesh to appear at the retina. 
Exercises for 11.4 Diffraction Grating

P11.7 Consider Fraunhofer diffraction from a grating of N slits having widths
$\Delta x$ and equal separations h. Make plots (label relevant points and
scaling) of the intensity pattern for $N = 1$, $N = 2$, $N = 5$, and $N = 1000$
in the case where $h = 2\Delta x$, $\Delta x = 5\ \mu$m, and $\lambda = 500$ nm. Let the
Fraunhofer diffraction be observed at the focus of a lens with focal
length $f = 100$ cm. Do you expect $I_{\text{peak}}$ to be the same value for all of
these cases?

P11.8 For the case of $N = 1000$ in P11.7, you wish to position a narrow slit at
the focus of the lens so that it transmits only the first-order diffraction
peak (i.e. at $khx/(2f) = \pm\pi$).

(a) How wide should the slit be if it is to
be half the separation between the first intensity zeros to either side of
the peak?

(b) What small change in wavelength (away from $\lambda = 500$ nm) will
cause the intensity peak to shift by the width of the slit found in part
(a)?



Exercises for 11.5 Spectrometers 



L11.9 (a) Use a HeNe laser to determine the period h of a reflective grating.

(b) Give an estimate of the blaze angle $\phi$ on the grating. HINT: Assume
that the blaze angle is optimized for first-order diffraction of the HeNe
laser (for one side) at normal incidence. The blaze angle enables a
mirror-like reflection of the diffracted light on each groove. (video)

(c) You have two mirrors of focal length 75 cm and the reflective grating
in the lab. You also have two very narrow adjustable slits and the ability
to 'tune' the angle of the grating. Sketch how to use these items to make
a monochromator (scans through one wavelength at a time). If the
beam that hits the grating is 5 cm wide, what do you expect the ultimate
resolving power of the monochromator to be in the wavelength range
of 500 nm? Do not worry about aberration such as astigmatism from
using the mirrors off axis.




Figure 11.21 




Figure 11.22 



L11.10 Study the Jarrell Ash monochromator. Use a tungsten lamp as a source
and observe how the instrument works by taking the entire top off.
Do not breathe or touch when you do this. In the dark, trace the light
inside of the instrument with a white plastic card and observe what
happens when you change the wavelength setting. Place the top back
on when you are done. (video)

(a) Predict the best theoretical resolving power that this instrument can
achieve, assuming 1200 lines per millimeter.

(b) What should the width $\Delta x$ of the entrance and exit slits be to obtain
this resolving power? Assume $\lambda = 500$ nm.

HINT: Set $\Delta x$ to be the distance between the peak and the first zero of
the diffraction pattern at the exit slit for monochromatic light.

Exercises for 11.7 Gaussian Laser Beams 
P11.11 (a) Confirm that (11.41) reduces to (11.35) when $z = 0$.

(b) Take the limit $z \gg z_0$ to find the field far from the laser focus.

P11.12 Use the Fraunhofer integral formula (either (10.19) or (10.28)) to
determine the far-field pattern of a Gaussian laser focus (11.35).

HINT: The answer should agree with P11.11 part (b).

L11.13 Consider the following setup where a diverging laser beam is collimated
using an uncoated lens. A double reflection from both surfaces of the
lens (known as a ghost) comes out in the forward direction, focusing
after a short distance. Use a CCD camera to study this focused beam.
The collimated beam serves as a reference to reveal the phase of the
focused beam through interference. Because the weak ghost beam
concentrates near its focus, the two beams can have similar intensities
for optimal interference effects. (video) 5

Figure 11.23



Figure 11.24

The ghost beam $E_1(\rho, z)$ is described by (11.41), where the origin is at
the focus. Let the collimated beam be approximated as a plane wave
$E_2 e^{i(kz + \phi)}$, where $\phi$ is the relative phase between the two beams. The
net intensity is then $I_t(\rho, z) \propto \left| E_1(\rho, z) + E_2 e^{i(kz + \phi)} \right|^2$ or

$$
I_t(\rho, z) = I_2 + I_1(\rho, z) + 2\sqrt{I_2 I_1(\rho, z)}\, \cos\left( \frac{k\rho^2}{2R(z)} - \tan^{-1}\frac{z}{z_0} - \phi \right)
$$

where $I_1(\rho, z)$ is given by (11.46). We now have a formula that retains
both $R(z)$ and the Gouy shift $\tan^{-1} z/z_0$, which are not present in the
intensity distribution of a single beam (see (11.46)).

(a) Determine the f-number for the ghost beam (see Example 11.4).
Use this measurement to predict a value for $w_0$. HINT: You know that
at the lens, the focusing beam is the same size as the collimated beam.

(b) Measure the actual spot size $w_0$ at the focus. How does it compare
to the prediction?

HINT: Before measuring the spot size, make a subtle adjustment to
the tilt of the lens. This incidentally causes the phase between the two
beams to vary by small amounts, which you can set to $\phi = \pm\pi/2$. Then
at the focus the cosine term vanishes and the two beams don't interfere
(i.e. the intensities simply add). This is accomplished if the center of
the interference pattern is as dark as possible either far before or far
after the focus.

(c) Observe the effect of the Gouy shift. Since $\tan^{-1} z/z_0$ varies over a
range of $\pi$, you should see that the ring pattern before versus after the
focus inverts (i.e. the bright rings exchange with the dark ones).

(d) Predict the Rayleigh range $z_0$ and check that the radius of curvature
$R(z) = z + z_0^2/z$ agrees with measurement.

HINT: You should see interference rings similar to those in Fig. 11.25.
The only phase term that varies with $\rho$ is $k\rho^2/2R(z)$. If you count N
fringes out to a radius $\rho$, then $k\rho^2/2R(z)$ has varied by $2\pi N$.
Figure 11.25

5 J. Peatross and M. V. Pack, "Viewing the Mathematical Structure of Gaussian Laser Beams in a
Student Laboratory," Am. J. Phys. 69, 1169 (2001).

Exercises for 11.A ABCD Law for Gaussian Beams

P11.14 Find the solutions to (11.55) (i.e. find $z'$ and $z_0'$ in terms of $z$ and $z_0$).
Show that the results are in agreement with (11.53) and (11.54).

P11.15 Assuming a collimated beam (i.e. $z = 0$ and beam waist $w_0$), find the
location $L = -z'$ and size $w_0'$ of the resulting focus when the beam goes
through a thin lens with focal length f.

L11.16 Place a lens in a HeNe laser beam soon after the exit mirror of the cavity.
Characterize the focus of the resulting laser beam, and compare the
results with the expressions derived in P11.15.

P11.17 Prove the ABCD law for a beam propagating through a thick window of
material with matrix

Chapter 12

Interferograms and Holography 



In chapter 8, we studied a Michelson interferometer in an idealized sense: 1) The 
light entering the instrument was considered to be a plane wave. 2) The retro- 
reflecting mirrors were considered to be aligned perpendicular to the beams 
impinging on them. 3) All reflective surfaces were taken to be perfectly flat. If any 
of these conditions are not met, the beam emerging from the interferometer is 
likely to exhibit an interference or fringe pattern. A recorded fringe pattern (on a 
CCD or photographic film) is called an interferogram. In this chapter, we examine 
typical fringe patterns that can be produced in an interferometer. Such patterns 
are very useful for testing the prescription and quality of optical components. 1 

We will also study holography, where an interference pattern (or fringe pat- 
tern) is recorded and then later used to diffract light, in much the same way that 
gratings diffract light. 2 A recorded fringe pattern, when used for this purpose, 
is called a hologram. When light diffracts from a hologram, it can mimic the 
light field originally used to generate the fringe pattern. This is true even for 
complicated fields, recorded when light scatters from arbitrary three-dimensional 
objects. When the light field is re-created through diffraction, the resulting image 
looks three-dimensional, since the holographic fringes reconstruct the original
light field over a wide range of viewing angles. 

12.1 Interferograms 

Consider the Michelson interferometer seen in Fig. 12.1. Suppose that the
beamsplitter divides the fields evenly, so that the overall output intensity is given by
(8.1):

$$
I_{\text{tot}} = 2 I_0 \left[ 1 + \cos(\omega\tau) \right] \tag{12.1}
$$

As a reminder, $\tau$ is the roundtrip delay time of one path relative to the other. This
equation is based on the idealized case, where the amplitude and phase of the two

1 See M. Born and E. Wolf, Principles of Optics, 7th ed., Sect. 7.5.5 (Cambridge: Cambridge
University Press, 1999).

2 In fact, a grating can be considered to be a hologram and holographic techniques are often
employed to produce gratings.
Figure 12.1 Michelson interferometer.

Figure 12.2 Fringe patterns for a Michelson interferometer: (a) Horizontally
misaligned beams. (b) Vertically misaligned beams. (c) Both vertically and
horizontally misaligned beams. (d) Diverging beam with unequal paths.
(e) Diverging beam with unequal paths and horizontal misalignment.

beams are uniform and perfectly aligned to each other following the beamsplitter.
The entire beam 'blinks' on and off as the delay $\tau$ is varied.

What happens if one of the retro-reflecting mirrors is misaligned by a small
angle $\theta$? The fringe patterns seen in Fig. 12.2 (a)-(c) are the result. By the law
of reflection, the beam returning from the misaligned mirror deviates from the
'ideal' path by an angle $2\theta$. This puts a relative phase variation of

$$
\phi = kx \sin(2\theta_x) + ky \sin(2\theta_y) \tag{12.2}
$$

on the misaligned beam. 3 Here $\theta_x$ represents the tilt of the mirror in the
x-dimension and $\theta_y$ represents the amount of tilt in the y-dimension.
When the two plane waves join, the resulting intensity pattern is

$$
I_{\text{tot}} = 2 I_0 \left[ 1 + \cos(\phi + \omega\tau) \right] \tag{12.3}
$$

The phase term $\phi$ depends on the local position within the beam through x and
y. Regions of uniform phase, called fringes (in this case individual stripes), have
the same intensity. As the delay $\tau$ is varied, the fringes seem to 'move' across the
detector. In this case, the fringes appear at one edge of the beam and disappear
at the other.
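To make the stripe pattern concrete, here is a minimal sketch that evaluates the intensity (12.3) along x for a mirror tilted only in the x-dimension. The function names and the tilt of 0.02 degrees are assumptions chosen for illustration. Setting the phase (12.2) equal to $2\pi$ gives the stripe period $\lambda/\sin(2\theta_x)$, which the code computes directly.

```python
import math

def tilt_fringe_intensity(x, wavelength, theta_x, omega_tau=0.0, I0=1.0):
    """Intensity (12.3) along x for a mirror tilted by theta_x (radians)."""
    k = 2 * math.pi / wavelength
    phi = k * x * math.sin(2 * theta_x)  # phase (12.2) with theta_y = 0
    return 2 * I0 * (1 + math.cos(phi + omega_tau))

def fringe_period(wavelength, theta_x):
    """Distance in x over which the phase (12.2) advances by 2*pi."""
    return wavelength / math.sin(2 * theta_x)

# Illustrative values (assumed): HeNe wavelength, small mirror tilt.
wavelength = 633e-9               # meters
theta_x = math.radians(0.02)      # mirror tilt

period = fringe_period(wavelength, theta_x)
print(f"fringe period: {period * 1e3:.3f} mm")
```

For these assumed values the stripes are a little under a millimeter apart. Changing `omega_tau` shifts the whole pattern sideways without altering the period, which is the 'moving' fringe behavior described above.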

Another interesting situation arises when the beams in a Michelson interferometer
are diverging. A fringe pattern of concentric circles will be seen at the
detector when the two beam paths are unequal (see Fig. 12.2 (d)). The radius of
curvature for the beam traveling the longer path is increased by the added amount
of delay $d = \tau c$. Thus, if beam 1 has radius of curvature $R_1$ when returning to the
beam splitter, then beam 2 will have radius $R_2 = R_1 + d$ upon return (assuming flat
mirrors). The relative phase (see phase term in (11.41)) between the two beams is

$$
\phi = \frac{k\rho^2}{2R_1} - \frac{k\rho^2}{2R_2} \tag{12.4}
$$

and the intensity pattern at the detector is given as before by (12.3).
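As a short worked step (a sketch that assumes flat mirrors and the paraxial phase term of (11.41)), substituting the second radius of curvature, equal to the first plus d, into the relative phase shows how the ring spacing depends on the path difference. The fringe index m and the radii $\rho_m$ are labels introduced here for illustration:

```latex
\phi = \frac{k\rho^2}{2}\left(\frac{1}{R_1} - \frac{1}{R_1 + d}\right)
     = \frac{k\rho^2 d}{2 R_1 (R_1 + d)}
     = \frac{\pi \rho^2 d}{\lambda R_1 R_2}
% Adjacent fringes of the same kind differ by 2\pi in \phi, so
\rho_{m+1}^2 - \rho_m^2 = \frac{2 \lambda R_1 R_2}{d}
```

The rings therefore crowd together with increasing radius, and the whole pattern spreads out as d approaches zero, which recovers the uniformly 'blinking' beam of the equal-path case.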



12.2 Testing Optical Components 

A Michelson interferometer is ideal for testing the quality of optical surfaces. If 
any of the flat surfaces (including the beam splitter) in the interferometer are 
distorted, the fringe pattern readily reveals it. Figure 12.3 shows an example of a 
fringe pattern when one of the mirrors in the interferometer has an arbitrary
deformity in the surface figure. 4 A new fringe stripe occurs for every half wavelength
that the surface varies. (The round trip turns a half wavelength into a whole 
wavelength.) This makes it possible to determine the flatness of a surface with 
very high precision. Of course, in order to test a given surface in an interferometer, 
the quality of all other surfaces in the interferometer must first be ensured. 

3 This ignores an additive constant for fixed z. 

4 The surface figure is a name for how well a surface contour matches a desired prescription. 



A typical industry standard for research-grade optics is to specify the surface
flatness to within one tenth of an optical wavelength (633 nm HeNe laser). This 
means that the interferometer should reveal no more than one fifth of a fringe 
variation across the substrate surface. The fringe pattern tells the technician how 
the surface should continue to be polished in order to achieve the desired surface 
flatness. Figure 12.3(a) shows the fringe pattern for a surface with significant 
variations in the surface figure. 
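The conversion between fringe count and surface height used in this section is simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes normal-incidence reflection, so that each full fringe corresponds to a half-wavelength change in surface height.

```python
def surface_deviation(fringes, wavelength):
    """Surface height change corresponding to a given number of fringes."""
    return fringes * wavelength / 2.0

def fringes_for_deviation(height, wavelength):
    """Number of fringes produced by a given surface height change."""
    return 2.0 * height / wavelength

wavelength = 633e-9  # HeNe laser, meters

# A tenth-wave flatness specification corresponds to one fifth of a fringe.
tenth_wave_fringes = fringes_for_deviation(wavelength / 10, wavelength)

# Conversely, one full fringe of distortion means a half-wave height change.
one_fringe_height = surface_deviation(1.0, wavelength)

print(tenth_wave_fringes, one_fringe_height)
```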

When testing a surface, it is not necessary to remove all tilt from the alignment 
before the effects of surface variations become apparent in the fringe pattern. 
In fact, it can be helpful to observe the distortions as deflections in a normally 
regularly striped fringe pattern. Figure 12.3(b) shows fringes from a distorted 
surface when some tilt is left in the interferometer alignment. An important 
advantage to leaving some tilt in the beam is that one can better tell the sign of 
the phase errors. We can see, for example, in the case of tilt that the two major 
distortion regions in Fig. 12.3 have opposite phase; we can tell that one region of 
the substrate protrudes while the other dishes in. On the other hand, this is not 
clear for an interferogram with no tilt. 

Other types of optical components (besides flat mirrors) can also be tested 
with an interferometer. Figure 12.4 shows how a lens can be tested using a 
convex mirror to compensate for the focusing action of the lens. With appropriate 
spacing, the lens-mirror combination can act like a flat surface. Distortions in the 
lens figure are revealed in the fringe pattern. In this case, the surfaces of the lens 
are tested together, and variations in optical path length are observed. In order 
to record fringes, say with a CCD camera, it is often convenient to image a larger 
beam onto a relatively small active area of the detector. The imaging objective 
should be adjusted to produce an image of the test optic on the detector screen. 
The diameter of the objective lens needs to accommodate the whole beam. 

12.3 Generating Holograms 

In the late 1940's, Dennis Gabor developed the concept of holography, but it wasn't 
until after the invention of the laser that this field really blossomed. Consider 
a coherent monochromatic beam of light that is split in half by a beamsplitter, 
similar to that in a Michelson interferometer. Let one beam, called the reference 
beam, proceed directly to a recording film, and let the other beam scatter from 
an arbitrary object back towards the same film. The two beams interfere at the 
recording film. It is best to split the beam initially into unequal intensities such 
that the light scattered from the object has an intensity similar to the reference 
beam at the film. 

The purpose of the film is to record the interference pattern. It is important 
that the coherence length of the light be much longer than the difference in 
path length starting from the beam splitter and ending at the film. In addition, 
during exposure to the film, it is important that the whole setup be stable against 
vibrations on the scale of a wavelength since this will cause the fringes to wash 




Figure 12.3 (a) Fringe pattern arising from an arbitrarily distorted
mirror in a perfectly aligned interferometer with plane wave beams.
(b) Fringe pattern from the same mirror as (a) when the mirror is
tilted (still plane wave beams). The distortion due to surface variation
is still easily seen.

Figure 12.4 Twyman-Green setup for testing lenses.




Figure 12.5 Exposure of holographic film.




Dennis Gabor (1900-1979, Hungarian) 
was born in Budapest. As a teenager, 
he fought for Hungary in World War 
I. Following the war, he studied at the 
Technical University of Budapest and 
later at the Technical University of 
Berlin. In 1927, Gabor completed his 
doctoral dissertation on cathode ray 
tubes and began a long career work- 
ing on electron-beam devices such as 
oscilloscopes, televisions, and electron 
microscopes. It was in the context of 
'electron optics' that he invented the 
concept of holography, which relied on 
the wave nature of electron beams. Ga- 
bor did this work while working for a 
British company, after fleeing Germany 
when Hitler came to power. Holography 
did not become practical until after the 
invention of the laser, which provided 
a bright coherent light source. (Gabor 
had attempted to make holograms ear- 
lier using a spectral line from a mercury 
lamp.) In 1964 the first hologram was 
produced. Soon after, holograms be- 
came commercially available and were 
popularized. Gabor accepted a post as 
professor of applied physics at the Impe- 
rial College of London from 1958 until 
he retired in 1967. He was awarded the 
Nobel prize in physics in 1971 for the 
invention of holography. (Wikipedia) 



out. For simplicity, we neglect the vector nature of the electric field, assuming 
that the scattering from the object for the most part preserves polarization and 
that the angle between the two beams incident on the film is modest (so that the 
electric fields of the two beams are close to parallel). To the extent that the light 
scattered from the object contains the polarization component orthogonal to that 
of the reference beam, it provides a uniform (unwanted) background exposure to 
the film on top of which the fringe pattern is recorded. 

In general terms, we may write the electric field arriving at the film as 5

$$
E_{\text{film}}(\mathbf{r})\, e^{-i\omega t} = E_{\text{object}}(\mathbf{r})\, e^{-i\omega t} + E_{\text{ref}}(\mathbf{r})\, e^{-i\omega t} \tag{12.5}
$$



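Here, the coordinate $\mathbf{r}$ indicates locations on the film surface, which may have
arbitrary shape, but often is a plane. The field $E_{\text{object}}(\mathbf{r})$, which is scattered from
the object, is in general very complicated. The field $E_{\text{ref}}(\mathbf{r})$ may be equally
complicated, but typically it is convenient if it has a simple form such as a plane wave,
since this beam must be re-created later in order to view the hologram.
The intensity of the field (12.5) is given by

$$
\begin{aligned}
I_{\text{film}}(\mathbf{r}) &= \frac{1}{2} c \epsilon_0 \left| E_{\text{object}}(\mathbf{r}) + E_{\text{ref}}(\mathbf{r}) \right|^2 \\
&= \frac{1}{2} c \epsilon_0 \left[ \left| E_{\text{object}}(\mathbf{r}) \right|^2 + \left| E_{\text{ref}}(\mathbf{r}) \right|^2 + E_{\text{ref}}^*(\mathbf{r}) E_{\text{object}}(\mathbf{r}) + E_{\text{ref}}(\mathbf{r}) E_{\text{object}}^*(\mathbf{r}) \right]
\end{aligned} \tag{12.6}
$$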



For typical photographic film, the exposure of the film is proportional to the
intensity of the light hitting it. This is known as the linear response regime. That
is, after the film is developed, the transmittance T of the light through the film is
proportional to the intensity of the light that exposed it ($I_{\text{film}}$). However, for low
exposure levels, or for film specifically designed for holography, the transmission
of the light through the film can be proportional to the square of the intensity
of the light that exposes the film. Thus, after the film is exposed to the fringe
pattern and developed, the film acquires a spatially varying transmission function
according to

$$
T(\mathbf{r}) \propto I_{\text{film}}^2(\mathbf{r}) \tag{12.7}
$$

If at a later point in time light of intensity $I_{\text{incident}}$ is directed onto the film, it will
transmit according to $I_{\text{transmitted}} = T(\mathbf{r})\, I_{\text{incident}}$. In this case, the field, as it emerges
from the other side of the film, will be

$$
E_{\text{transmitted}}(\mathbf{r}) = t(\mathbf{r})\, E_{\text{incident}}(\mathbf{r}) \propto I_{\text{film}}(\mathbf{r})\, E_{\text{incident}}(\mathbf{r}) \tag{12.8}
$$

where $t(\mathbf{r}) = \sqrt{T(\mathbf{r})}$.



12.4 Holographic Wavefront Reconstruction

To see a holographic image, we re-illuminate the film (previously exposed and
developed) with the original reference beam. That is, we send in

$$
E_{\text{incident}}(\mathbf{r}) = E_{\text{ref}}(\mathbf{r}) \tag{12.9}
$$



5 See P. W. Milonni and J. H. Eberly, Lasers, Sect. 16.4-16.5 (New York: Wiley, 1988); G. R. Fowles, 
Introduction to Modern Optics, 2nd ed., Sect. 5.7 (Toronto: Dover, 1975). 
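and view the light that is transmitted. According to (12.6) and (12.8), the
transmitted field is proportional to

$$
\begin{aligned}
E_{\text{transmitted}}(\mathbf{r}) &\propto I_{\text{film}}(\mathbf{r})\, E_{\text{ref}}(\mathbf{r}) \\
&= \left[ \left| E_{\text{object}}(\mathbf{r}) \right|^2 + \left| E_{\text{ref}}(\mathbf{r}) \right|^2 \right] E_{\text{ref}}(\mathbf{r})
 + \left| E_{\text{ref}}(\mathbf{r}) \right|^2 E_{\text{object}}(\mathbf{r})
 + E_{\text{ref}}^2(\mathbf{r})\, E_{\text{object}}^*(\mathbf{r})
\end{aligned} \tag{12.10}
$$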

Although (12.10) looks fairly complicated, each of the three terms has a direct
interpretation. The first term is just the reference beam $E_{\text{ref}}(\mathbf{r})$ with an amplitude
modified by the transmission through the film. It is the residual undeflected beam,
similar to the zero-order diffraction peak for a transmission grating. The second
term is interpreted as a reconstruction of the light field originally scattered from
the object, $E_{\text{object}}(\mathbf{r})$. Its amplitude is modified by the intensity of the reference
beam, but if the reference beam is uniform across the film, this hardly matters.
An observer looking into the film sees a wavefront identical to the one produced 
by the original object (superimposed with the other fields in (12.10)). Thus, 
the observer sees a virtual image at the location of the original object. Since 
the wavefront of the original object has genuinely been recreated, the image 
looks 'three-dimensional', because the observer is free to view from different 
perspectives. 

The final term in (12.10) is proportional to the complex conjugate of the 
original field from the object. It also contains twice the phase of the reference 
beam, which we can overlook if the reference beam is uniform on the film. In 
this case, the complex conjugate of the object field actually converges to a real 
image of the original object. This image is located on the observer's side of the 
film, but it is often of less interest since the image is inside out. An ideal screen for 
viewing this real image would be an item shaped identical to the original object, 
which of course defeats the purpose of the hologram! To the extent that the film is 
not flat or to the extent that the reference beam is not a plane wave, the phase of
$E_{\text{ref}}^2(\mathbf{r})$ severely distorts the image. On the other hand, the virtual image previously
described never suffers from this problem. 



Example 12.1 

Analyze the three field terms in (12.10) for a hologram made from a point object, 
as depicted in Fig. 12.7. 

Solution: Presumably, the point object is illuminated sufficiently brightly so as to 
make the scattered light have an intensity similar to the reference beam at the film. 

Let the reference plane wave strike the film at normal incidence. Then the reference
field will have constant amplitude and phase across it; call it $E_{\text{ref}}$. The field from
the point object can be treated as a spherical wave:

$$
E_{\text{object}}(\rho) = E_{\text{ref}} \frac{L\, e^{ik\sqrt{L^2 + \rho^2}}}{\sqrt{L^2 + \rho^2}} \quad \text{(point source example)} \tag{12.11}
$$



Here $\rho$ represents the radial distance from the center of the film to some other
point on the film. We have taken the amplitude of the object field to match $E_{\text{ref}}$ in
the center of the film.

Figure 12.6 Holographic reconstruction of wavefront through
diffraction from fringes on film. Compare with Fig. 12.5.

Figure 12.7 Exposure to holographic film by a point source
and a reference plane wave. The holographic fringe pattern for a
point object and a plane wave reference beam exposing a flat film is
shown on the right.

Figure 12.8 Reference beam incident on previously exposed
holographic film. (a) Part of the beam goes through. (b) Part of the
beam takes on the field profile of the original object, undeflected.
(c) Part of the beam converges to a real image of the original object.

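After the film is exposed, developed, and re-illuminated by the reference beam, the
field emerging from the right-hand side of the film, according to (12.10), becomes

$$
E_{\text{transmitted}}(\rho) \propto \left[ \frac{L^2}{L^2 + \rho^2} + 1 \right] \left| E_{\text{ref}} \right|^2 E_{\text{ref}}
 + \left| E_{\text{ref}} \right|^2 E_{\text{ref}} \frac{L\, e^{ik\sqrt{L^2 + \rho^2}}}{\sqrt{L^2 + \rho^2}}
 + \left| E_{\text{ref}} \right|^2 E_{\text{ref}} \frac{L\, e^{-ik\sqrt{L^2 + \rho^2}}}{\sqrt{L^2 + \rho^2}} \tag{12.12}
$$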



We see the three distinct waves that emerge from the holographic film. The first 
term in (12.12) represents the plane wave reference beam passing straight through 
the film with some variation in amplitude (depicted in Fig. 12.8 (a)). The second 
term in (12.12) has the identical form as the field from the original object (aside 
from an overall amplitude factor). It describes an outward-expanding spherical 
wave, which gives rise to a virtual image at the location of the original point object, 
as depicted in Fig. 12.8 (b). The final term in (12.12) corresponds to a converging 
spherical wave, which focuses to a point at a distance L from the observer's side of 
the screen (depicted in Fig. 12.8 (c)). 
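The decomposition used in this example can be checked numerically. The sketch below is an illustration only: the function names, the distance L = 0.32 m, and the sample radii are assumptions, and the constant factor in (12.6) is dropped since (12.10) is a proportionality. It builds the point-source field of (12.11), forms the film intensity times the reference field, and compares the result with the sum of the three terms of (12.10).

```python
import cmath
import math

wavelength = 633e-9              # meters (assumed HeNe wavelength)
k = 2 * math.pi / wavelength
L = 0.32                         # point source to film distance, meters (assumed)
E_ref = 1.0                      # uniform reference field at normal incidence

def E_object(rho):
    """Spherical wave from the point object, as in (12.11)."""
    r = math.hypot(L, rho)
    return E_ref * L * cmath.exp(1j * k * r) / r

def transmitted(rho):
    """|E_object + E_ref|^2 * E_ref, proportional to the transmitted field."""
    return abs(E_object(rho) + E_ref) ** 2 * E_ref

def three_terms(rho):
    """The undeflected, virtual-image, and real-image terms of (12.10)."""
    Eo = E_object(rho)
    undeflected = (abs(Eo) ** 2 + abs(E_ref) ** 2) * E_ref
    virtual_image = abs(E_ref) ** 2 * Eo             # diverging spherical wave
    real_image = E_ref ** 2 * Eo.conjugate()         # converging spherical wave
    return undeflected, virtual_image, real_image

# Largest mismatch between the two forms over a few sample radii.
max_error = max(abs(sum(three_terms(rho)) - transmitted(rho))
                for rho in (0.0, 1e-3, 5e-3))
print(max_error)
```

The virtual-image term carries the same phase as the original object field, while the real-image term carries the conjugate phase, which is why it describes a converging wave.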
Exercises for 12.1 Interferograms

P12.1 An ideal Michelson interferometer that uses flat mirrors is perfectly 
aligned to a wide collimated laser beam. Suppose that one of the mir- 
rors is then misaligned by 0.1°. What is the spacing between adjacent 
fringes on the screen if the wavelength is $\lambda = 633$ nm? What would
happen if, instead of tilting one of the mirrors, the angle of the input 
beam (before the beamsplitter) changed by 0.1°? 

P12.2 An ideal Michelson interferometer uses flat mirrors perfectly aligned 
to an expanding beam that diverges from a point 50 cm before the 
beamsplitter. Suppose that one mirror is 10 cm away from the beam 
splitter, and the other is 11 cm. Suppose also that the center of the 
resulting bull's-eye fringe pattern is dark. If a screen is positioned 10 cm 
after the beam splitter, what is the radial distance to the next dark fringe 
on the screen if the wavelength is $\lambda = 633$ nm?

Exercises for 12.2 Testing Optical Components 

L12.3 Set up an interferometer and observe distortions to a mirror substrate 
when the setscrew is over tightened. 

Exercises for 12.3 Generating Holograms 

P12.4 Consider a diffraction grating as a simple hologram. Let the light from 
the "object" be a plane wave (object placed at infinity) directed onto 
a flat film at angle $\theta$. Let the reference beam strike the film at normal 
incidence, and take the wavelength to be $\lambda$. 

(a) What is the period of the fringes? 

(b) Show that when re-illuminated by the reference beam, the three 
terms in (12.10) give rise to zero-order and 1st-order diffraction (occurring 
on each side of zero-order). 

P12.5 Consider the holographic pattern produced by the point object 
described in section 12.4. 

(a) Show that the phase of the real image in (12.12) may be approximated 
as $\Delta\phi = -k\rho^2/2L$, aside from a spatially independent overall 
phase. Compare with (11.10) and comment. 

(b) This hologram is similar to a Fresnel zone plate, used to focus 
extreme ultraviolet light or x-rays, for which it is difficult to make a lens. 
Graph the field transmission for the hologram as a function of $\rho$ and 
superimpose a similar graph for a "best-fit" mask that has regions of 
either 100% or 0% transmission. Use $\lambda = 633$ nm and $L = (5 \times 10^5 - \tfrac{1}{4})\lambda$ 
(this places the point source about 32 cm before the screen). See 
Fig. 12.9. 

L12.6 Make a hologram. 



Figure 12.9 Field transmission for 
a point-source hologram (upper) 
and a Fresnel zone plate (middle), 
and a plot of both as a function of 
radius (bottom). 



Review, Chapters 9-12 



True and False Questions 

R48 T or F: The eikonal equation and Fermat's principle depend on the 
assumption that the wavelength is relatively small compared to features 
of interest. 

R49 T or F: The eikonal equation and Fermat's principle depend on the 
assumption that the index of refraction varies only gradually. 

R50 T or F: The eikonal equation and Fermat's principle depend on the 
assumption that the angles involved must not be too big. 

R51 T or F: The eikonal equation and Fermat's principle depend on the 
assumption that the polarization is important to the problem. 

R52 T or F: Spherical aberration can be important even when the paraxial 
approximation works well. 

R53 T or F: Chromatic aberration (the fact that refractive index depends on 
frequency) is an example of the violation of the paraxial approximation. 

R54 T or F: The Fresnel approximation falls within the paraxial approxima- 
tion. 

R55 T or F: The imaging relation $1/f = 1/d_o + 1/d_i$ relies on the paraxial ray 
approximation. 

R56 T or F: The spherical waves given by $e^{ikR}/R$ are exact solutions to 
Maxwell's equations. 

R57 T or F: Spherical waves can be used to understand diffraction from 
apertures that are relatively large compared to $\lambda$. 

R58 T or F: Fresnel was the first to conceive of spherical waves. 

R59 T or F: Spherical waves were accepted by Poisson immediately without 
experimental proof. 
R60 T or F: The array theorem is useful for deriving the Fresnel diffraction 
from a grating. 

R61 T or F: A diffraction grating with a period h smaller than a wavelength 
is ideal for making a spectrometer. 

R62 T or F: The blaze on a reflection grating can improve the amount of 
energy in a desired order of diffraction. 

R63 T or F: The resolving power of a spectrometer used in a particular 
diffraction order depends only on the number of lines illuminated (not 
wavelength or grating period). 

R64 T or F: The central peak of the Fraunhofer diffraction from two narrow 
slits separated by spacing $h$ has the same width as the central 
diffraction peak from a single slit with width $\Delta x = h$. 

R65 T or F: The central peak of the Fraunhofer diffraction from a circular 
aperture of diameter $\ell$ has the same width as the central diffraction 
peak from a single slit with width $\Delta x = \ell$. 

R66 T or F: The Fraunhofer diffraction pattern appearing at the focus of a 
lens varies in angular width, depending on the focal length of the lens 
used. 

R67 T or F: Fraunhofer diffraction can be viewed as a spatial Fourier trans- 
form (or inverse transform if you prefer) on the field at the aperture. 



Figure 12.10 (labels in the figure: object, image, $d_o$, $d_i$) 



Problems 

R68 (a) Derive Snell's law using Fermat's principle. 

(b) Derive the law of reflection using Fermat's principle. 

R69 (a) Consider a ray of light emitted from an object, which travels a 
distance $d_o$ before traversing a lens of focal length $f$ and then traveling 
a distance $d_i$. Write a vector equation relating 
$\begin{bmatrix} y_2 \\ \theta_2 \end{bmatrix}$ to $\begin{bmatrix} y_1 \\ \theta_1 \end{bmatrix}$. 
Be sure to simplify the equation so that only one ABCD matrix is involved. 

HINT: $\begin{bmatrix} 1 & d \\ 0 & 1 \end{bmatrix}$, $\begin{bmatrix} 1 & 0 \\ -1/f & 1 \end{bmatrix}$ 

(b) Explain the requirement on the ABCD matrix in part (a) that ensures 
that an image appears for the distances chosen. From this requirement, 
extract a familiar constraint on $d_o$ and $d_i$. Also, make a reasonable 
definition for magnification $M$ in terms of $y_1$ and $y_2$, then substitute to 
find $M$ in terms of $d_o$ and $d_i$. 
(c) A telescope is formed with two thin lenses separated by the sum of 
their focal lengths $f_1$ and $f_2$. Rays from a given far-away point all strike 
the first lens with essentially the same angle $\theta_1$. Angular magnification 
$M_\theta$ quantifies the telescope's purpose of enlarging the apparent angle 
between points in the field of view. 

Give a sensible definition for angular magnification in terms of $\theta_1$ and 
$\theta_2$. Use the ABCD-matrix formulation to derive the angular magnification 
of the telescope in terms of $f_1$ and $f_2$. 

R70 (a) Show that a system represented by a matrix 
$\begin{bmatrix} A & B \\ C & D \end{bmatrix}$ (beginning 
and ending in the same index of refraction) can be made to look like 
the matrix for a thin lens if the beginning and ending positions along 
the z-axis are referenced from two principal planes, located distances 
$p_1$ and $p_2$ before and after the system. 

HINT: $\begin{vmatrix} A & B \\ C & D \end{vmatrix} = 1$. 

(b) Where are the principal planes located and what is the effective 
focal length for two identical thin lenses with focal lengths $f$ that are 
separated by a distance $d = f$ (see Fig. 12.12)? 

Figure 12.11 

Figure 12.12 

R71 Derive the on-axis intensity (i.e. $x, y = 0$) of a Gaussian laser beam if 
you know that at $z = 0$ the electric field of the beam is 

$$E(\rho', z=0) = E_0\, e^{-\rho'^2/w_0^2}$$

Fresnel: 

$$E(x,y,d) \cong -\frac{i\,e^{ikd}\,e^{i\frac{k}{2d}(x^2+y^2)}}{\lambda d} \iint E(x',y',0)\, e^{i\frac{k}{2d}(x'^2+y'^2)}\, e^{-i\frac{k}{d}(xx'+yy')}\, dx'\,dy'$$

$$\int_{-\infty}^{\infty} e^{-Ax^2+Bx+C}\,dx = \sqrt{\frac{\pi}{A}}\; e^{\frac{B^2}{4A}+C}$$


R72 (a) You decide to construct a simple laser cavity with a flat mirror and 
another mirror with concave curvature of R = 100 cm. What is the 
longest possible stable cavity that you can make? 

HINT: Sylvester's theorem is 

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}^N = \frac{1}{\sin\theta} \begin{bmatrix} A\sin N\theta - \sin(N-1)\theta & B\sin N\theta \\ C\sin N\theta & D\sin N\theta - \sin(N-1)\theta \end{bmatrix}$$

where $\cos\theta = \tfrac{1}{2}(A + D)$. 

(b) The amplifier is YLF crystal, which lases at $\lambda = 1054$ nm. You decide 
to make the cavity 10 cm shorter than the longest possible (i.e. found in 
part (a)). What is the value of $w_0$, and where is the beam waist located 
inside the cavity (the place we assign to $z = 0$)? 

HINT: One can interpret the parameter $R(z)$ as the radius of curvature 
of the wave front. For a mode to exist in a laser cavity, the radius of 
curvature of each of the end mirrors must match the radius of curvature 
of the beam at that location. 

$$E(\rho,z) = E_0\,\frac{w_0}{w(z)}\, e^{-\rho^2/w^2(z)}\, e^{ikz + i\frac{k\rho^2}{2R(z)}}\, e^{-i\tan^{-1}(z/z_0)}$$

$$w(z) = w_0\sqrt{1 + z^2/z_0^2}$$

$$R(z) = z + z_0^2/z$$

$$z_0 = \frac{k w_0^2}{2}$$


R73 (a) Compute the Fraunhofer diffraction intensity pattern for a uniformly 
illuminated circular aperture with diameter $\ell$. 

HINT: 

$$E(x,y,d) \cong -\frac{i\,e^{ikd}\,e^{i\frac{k}{2d}(x^2+y^2)}}{\lambda d} \iint E(x',y',0)\, e^{-i\frac{k}{d}(xx'+yy')}\, dx'\,dy'$$

$$J_0(\alpha) = \frac{1}{2\pi}\int_0^{2\pi} e^{\pm i\alpha\cos(\theta-\theta')}\,d\theta'$$

$$\int_0^a J_0(bx)\,x\,dx = \frac{a}{b}\,J_1(ab)$$

$$J_1(1.22\pi) = 0$$

$$\lim_{x\to 0}\frac{2J_1(x)}{x} = 1$$

(b) The first lens of a telescope has a diameter of 30 cm, which is the 
only place where light is clipped. You wish to use the telescope to 
examine two stars in a binary system. The stars are approximately 25 
light-years away. How far apart need the stars be (in the perpendicular 
sense) for you to distinguish them in the visible range of $\lambda = 500$ nm? 
Compare with the radius of Earth's orbit, $1.5 \times 10^8$ km. 
R74 (a) Derive the Fraunhofer diffraction pattern for the field from a uniformly 
illuminated single slit of width $\Delta x$. (Don't worry about the 
y-dimension.) 

(b) Find the Fraunhofer intensity pattern for a grating of $N$ slits of width 
$\Delta x$ positioned on the mask at $x'_n = h\left(n - \frac{N+1}{2}\right)$ so that the spacing 
between all slits is $h$. 

HINT: The array theorem says that the diffraction pattern is $\sum_{n=1}^{N} e^{-i\frac{k}{d}x x'_n}$ 
times the diffraction pattern of a single slit. You will need 

$$\sum_{n=1}^{N} r^n = r\,\frac{r^N - 1}{r - 1}$$

(c) Consider Fraunhofer diffraction from the grating in part (b). The 
grating is 5.0 cm wide and is uniformly illuminated. For best resolution 
in a monochromator with a 50 cm focal length, what should the width 
of the exit slit be? Assume a wavelength of $\lambda = 500$ nm. 

R75 (a) A monochromatic plane wave with intensity $I_0$ and wavelength $\lambda$ 
is incident on a circular aperture of diameter $\ell$ followed by a lens of 
focal length $f$. Write the intensity distribution at a distance $f$ behind 
the lens. 

(b) You wish to spatially filter the beam such that, when it emerges from 
the focus, it varies smoothly without diffraction rings or hard edges. A 
pinhole is placed at the focus, which transmits only the central portion 
of the Airy pattern (inside of the first zero). Calculate the intensity 
pattern at a distance $f$ after the pinhole using the approximation given 
in the hint below. 

HINT: A reasonably good approximation of the transmitted field is 
that of a Gaussian $E(\rho,0) = E_f\, e^{-\rho^2/w_0^2}$, where $E_f$ is the magnitude of 
the field at the center of the focus found in part (a), and the width 
is $w_0 = 2\lambda f_\#/\pi$ with $f_\# \equiv f/\ell$. The figure below shows how well the 
Gaussian approximation fits the actual curve. We have assumed that 
the first aperture is a distance $f$ before the lens so that at the focus after 
the lens the wave front is flat at the pinhole. To avoid integration, you 
may want to use the result of P11.12 or P11.11(b) to get the Fraunhofer 
limit of the Gaussian profile. (See figure below.) 

Figure 12.13 

Figure 12.14 

Selected Answers 

R72: (a) 100 cm (b) 0.32 mm. 
R73: (b) $4.8 \times 10^8$ km. 
R74: (c) 5 μm. 



Chapter 13 

Blackbody Radiation 




Hot objects glow. In 1860, Kirchhoff proposed that the radiation emitted by hot 
objects as a function of frequency is approximately the same for all materials. 1 
The notion that all materials behave similarly led to the concept of an ideal 
blackbody radiator. Most materials have a certain shininess that causes light to 
reflect or scatter in addition to being absorbed and reemitted. However, light 
that falls upon an ideal blackbody is absorbed perfectly before the possibility of 
reemission, hence the name blackbody. 

The distribution of frequencies emitted by a blackbody radiator is related 
to its temperature. We often consider a blackbody radiator that is in thermal 
equilibrium with the surrounding light that is absorbed and reemitted. If it is 
not in thermal equilibrium, for example, if more light is emitted than absorbed, 
then the object inevitably cools as light escapes to the environment, moving the 
system toward thermal equilibrium. 

The Sun is a good example of a blackbody radiator. The light emitted from the 
Sun is associated with its surface temperature. Any light that arrives to the Sun 
from outer space is virtually 100% absorbed, however little light that might be, so 
the name blackbody aptly describes it. Mostly, light escapes to the much colder 
surrounding space (i.e. it is not in thermal equilibrium), and the temperature of 
the Sun's surface is maintained by the fusion process within. As another example, 
a glowing tungsten filament in an ordinary light bulb may be reasonably described 
as a blackbody radiator. However, surface reflections make it less than ideal both 
for absorption and emission. 

Experimentally, a near perfect blackbody radiator can be constructed from 
a hollow object. An example is shown in Fig. 13.1. As the interior of the object 
is heated, the light present inside the internal cavity is in equilibrium with the 
glowing walls. A small hole can be drilled through the wall to observe the radiation 
inside without significantly disturbing the system. The observation hole can be 
thought of as a perfect blackbody since any light entering the hole from the 
outside is eventually absorbed (before being potentially reemitted), if not on the 
first bounce then on subsequent bounces inside the cavity. 

1 An important exception is atomic vapors, which have relatively few discrete spectral lines. 
However, Kirchhoff's assumption holds quite well for most solids, which are sufficiently complex. 

Gustav Kirchhoff (1824-1887, German) 
was born in Königsberg, the son of a 
lawyer. Kirchhoff attended the University 
of Königsberg. While still a student, 
he developed what are now called Kirchhoff's 
laws for electrical circuits. During 
his career, Kirchhoff was a professor in 
Breslau, Heidelberg, and finally Berlin. 
Kirchhoff was one of the first to study 
the spectra emitted by various objects 
when heated. Not coincidentally, his 
colleague in Heidelberg was Robert 
Bunsen, inventor of the Bunsen burner. 
Kirchhoff coined the term 'blackbody' 
radiation. He demonstrated that an excited 
gas gives off a discrete spectrum, 
and that an unexcited gas surrounding 
a blackbody emitter produces dark lines 
in the blackbody spectrum. Together 
Kirchhoff and Bunsen discovered caesium 
and rubidium. Later in his career, 
Kirchhoff showed how to derive Fresnel's 
diffraction formula starting from 
the wave equation. (Wikipedia) 

Figure 13.1 Blackbody radiator. 
Thermal light emerges from the 
small hole in the end. 


In this chapter, we develop a theoretical understanding of blackbody radiation 
and provide some historical perspective. The explanation given by Max Planck 
in 1900 marks the birth of quantum mechanics. He postulated the existence of 
electromagnetic quanta, which we now call photons. Einstein used Planck's ideas 
to explain the photoelectric effect and to develop the concept of stimulated and 
spontaneous emission. Because of his analysis, Einstein can be thought of as the 
father of light amplification by stimulated emission of radiation (LASER). 

13.1 Stefan-Boltzmann Law 

One of the earliest properties deduced about blackbody radiation is known as the 
Stefan-Boltzmann law, first suggested by Stefan in 1879 and derived thermodynamically 
by Boltzmann in 1884. 2 This early (somewhat cumbersome) derivation 
is provided in appendix 13.A. 3 The Stefan-Boltzmann law says that the intensity $I$ 
(including all frequencies) that flows outward from an object's surface is given by 

$$I = e\sigma T^4 \qquad (13.1)$$

where $\sigma$ is called the Stefan-Boltzmann constant and $T$ is the absolute temperature 
(in Kelvin) of the surface. The value of the Stefan-Boltzmann constant is 
$\sigma = 5.6696 \times 10^{-8}$ W/m$^2\cdot$K$^4$. The dimensionless parameter $e$, called the emissivity, 
is equal to one for an ideal blackbody surface. However, it takes on smaller values 
for actual materials because of surface reflections. For example, the emissivity 
of tungsten is approximately $e = 0.4$. This takes into account surface reflections, 
which make it harder for a material to emit light as well as to absorb light. 4 

Figure 13.2 Blackbody radiator 
constructed as a cavity with a 
small hole to sample the internal 
light. 

As mentioned in the introduction, one can construct an ideal blackbody 
radiator from a material with e < 1 by creating an enclosure, or cavity, as depicted 
in Fig. 13.2. A small hole in the wall behaves to the outside world like an ideal 
blackbody surface. From the perspective of the outside world, the hole's 'surface' 
has emissivity e = 1. Light within the cavity recirculates until it is eventually 
absorbed. The intensity emerging from the hole automatically approaches that of 
an ideal blackbody radiator. 
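
As a rough numerical illustration of the Stefan-Boltzmann law, the sketch below evaluates $I = e\sigma T^4$ for a bare tungsten surface ($e = 0.4$, as quoted above) and for the observation hole of a cavity ($e = 1$). The function name and the filament temperature of 2800 K are assumptions chosen only for illustration; they do not come from the text.

```python
# Stefan-Boltzmann law, I = e * sigma * T**4.
SIGMA = 5.6696e-8  # W / (m^2 K^4), value quoted in the text


def radiated_intensity(temperature_k, emissivity=1.0):
    """Total outward intensity (W/m^2) from a surface at temperature_k."""
    if not 0.0 <= emissivity <= 1.0:
        raise ValueError("emissivity must lie between 0 and 1")
    return emissivity * SIGMA * temperature_k**4


# Hypothetical filament temperature, chosen only for illustration.
T_FILAMENT = 2800.0  # K
bare_tungsten = radiated_intensity(T_FILAMENT, emissivity=0.4)
cavity_hole = radiated_intensity(T_FILAMENT)  # the hole behaves as e = 1

print(f"tungsten surface: {bare_tungsten:.3e} W/m^2")
print(f"cavity hole:      {cavity_hole:.3e} W/m^2")
```

Because of the fourth-power dependence, doubling the temperature raises the emitted intensity by a factor of 16, regardless of the emissivity.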

It is sometimes useful to express intensity in terms of the energy density of 
the light field $u_{\rm field}$ (given by (2.53) in units of energy per volume). The connection 
between the intensity emerging from the observation hole in the wall of a 
blackbody cavity and the energy density of the thermal light within the cavity is 

$$I = \frac{c\,u_{\rm field}}{4} \quad\Rightarrow\quad u_{\rm field} = \frac{4\sigma T^4}{c} \qquad (13.2)$$




2 See P. W. Milonni, The Quantum Vacuum An Introduction to Quantum Electrodynamics, Sect. 
1.2 (San Diego: Academic Press, 1994). 

3 It is less effort to obtain the Stefan-Boltzmann law using the Planck radiation formula as a 
starting point (see P13.3). 

4 Emissivity typically has some frequency dependence, so what is presented here is an oversim- 
plification. 



Within the enclosed cavity, light travels at speed $c$ isotropically in all directions. A 
factor of 1/2 arises because only half of the energy travels toward the hole from 
within the cavity as opposed to away. The remaining factor of 1/2 occurs because 
the light emerging from the hole is directionally distributed over a hemisphere as 
opposed to flowing in the direction of the surface normal $\hat{\mathbf n}$. The average over the 
hemisphere is carried out as follows: 

$$\frac{\int_0^{2\pi} d\phi \int_0^{\pi/2} \hat{\mathbf r}\cdot\hat{\mathbf n}\,\sin\theta\,d\theta}{\int_0^{2\pi} d\phi \int_0^{\pi/2} \sin\theta\,d\theta} = \frac{1}{2} \qquad (13.3)$$
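
The hemisphere average can be checked numerically. The sketch below is a minimal midpoint-rule integration, assuming $\hat{\mathbf r}\cdot\hat{\mathbf n} = \cos\theta$ with $\theta$ measured from the surface normal; the function and variable names are illustrative only.

```python
import math


def hemisphere_average_of_cos(n_steps=10_000):
    """Average of r_hat . n_hat = cos(theta) over the outward hemisphere.

    Uses a midpoint rule in theta. The phi integral contributes the same
    factor of 2*pi to numerator and denominator, so it cancels.
    """
    d_theta = (math.pi / 2) / n_steps
    numerator = 0.0
    denominator = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * d_theta
        weight = math.sin(theta) * d_theta  # solid-angle weight
        numerator += math.cos(theta) * weight
        denominator += weight
    return numerator / denominator


directional_factor = hemisphere_average_of_cos()
half_heading_toward_hole = 0.5
geometric_factor = half_heading_toward_hole * directional_factor
print(directional_factor, geometric_factor)  # close to 0.5 and 0.25
```

The product of the two factors of 1/2 is the 1/4 that relates the intensity leaving the hole to $c\,u_{\rm field}$.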

Although (13.1) describes the total intensity of the light that leaves a blackbody 
surface, it does not describe what frequencies make up the radiation field. This 
frequency distribution was not fully described for another two decades, when 
Max Planck developed his famous formula. Planck was first to arrive at the correct 
formula for the spectrum of blackbody radiation, building on the work of others, 
most notably Wien, who came very close. At first, Planck tweaked Wien's formula 
to match newly available experimental data. When he attempted to explain 
it, he was forced to introduce the concept of light quanta. Even Planck was 
uncomfortable with and perhaps disbelieved the assumption that his formula 
implied, but he deserves credit for recognizing and articulating it. 

13.2 Failure of the Equipartition Principle 

In 1900, Lord Rayleigh attempted to explain the blackbody spectral distribution 
(intensity per frequency) as a function of temperature by applying the equipar- 
tition theorem to the problem. James Jeans gave a more complete derivation in 
1905, which included an overall proportionality constant. They were hopelessly 
behind, since Planck nailed the answer in 1900, but their failed (classical) ap- 
proach is useful pedagogically, and for that reason it gets more attention than it 
deserves. In this section, we also will examine the Rayleigh-Jeans approach to 
illustrate the shortcomings of classical concepts. This will help us better appreciate 
the quantum ideas in the following section. As we will see, the Rayleigh-Jeans 
approach actually gets the right answer in the long-wavelength limit. In fairness 
to Rayleigh and Jeans, they represented their formula as being useful only for long 
wavelengths. 

The thermodynamic law of equipartition implies that the energy in a system 
on the average is distributed equally among all degrees of freedom in the system. 
For example, a system composed of oscillators (say, electrons attached to 'springs' 
representing the response of the material on the walls of a blackbody radiator) 
has an energy of $k_B T/2$ for each degree of freedom, where $k_B = 1.38 \times 10^{-23}$ J/K 
is Boltzmann's constant. Rayleigh and Jeans supposed that each unique mode 
of the electromagnetic field should carry energy $k_B T$ just as each mechanical 
spring in thermal equilibrium carries energy $k_B T$ ($k_B T/2$ as kinetic and $k_B T/2$ as 
potential energy). The problem then reduces to that of finding the number of 
unique modes for the radiation at each frequency. 5 The idea is that requiring each 
mode of electromagnetic energy to hold energy $k_B T$ should reveal the spectral 
shape of blackbody radiation. 

Figure 13.3 The volume of a thin 
spherical shell in $n$, $m$, $\ell$ space. 

Number of Modes in an Electromagnetic Field 



Each frequency is associated with a specific wave number $k = \sqrt{k_x^2 + k_y^2 + k_z^2}$. Notice 
that there are many ways (i.e. combinations of $k_x$, $k_y$, and $k_z$) to come up 
with the same wave number $k = 2\pi\nu/c$ (corresponding to a single frequency $\nu$). To 
count these ways properly, we can let our experience with Fourier series guide 
us. Consider a box with each side of length $L$. The Fourier theorem (0.42) states 
that the total field inside the box (no matter how complicated the distribution) 
can always be represented as a superposition of sine (and cosine) waves. The total 
field in the box can therefore be written as 6 

$$\mathbf{E}(x,y,z) = \sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty}\sum_{\ell=-\infty}^{\infty} \mathbf{E}_{n,m,\ell}\; e^{i k_0 (n x + m y + \ell z)} \qquad (13.4)$$

where each component of the wave number in any of the three dimensions is an 
integer times 

$$k_0 = 2\pi/L \qquad (13.5)$$

Considering a box of size L does not artificially restrict our analysis, since we may 
later take the limit L — ► oo so that our box represents the entire universe. Moreover, 
L will naturally disappear from our calculation when we later consider the density 
of modes. 

We can think of a given wave number $k$ as specifying the equation of a sphere in a 
coordinate system with axes labeled $n$, $m$, and $\ell$: 

$$n^2 + m^2 + \ell^2 = \frac{k^2}{k_0^2} \qquad (13.6)$$

The fact that the integers $n$, $m$, and $\ell$ range over both positive and negative values 
automatically takes into account that the field may travel in the forwards or the 
backwards direction. 

We need to know how many more ways there are to choose $n$, $m$, and $\ell$ when the 
radius $k/k_0$ increases to $(k + dk)/k_0$. The answer is the difference in the 
volume of the two spheres shown in Fig. 13.3: 

$$\#\ \text{modes in}\ (k, k+dk) = 4\pi\,\frac{k^2}{k_0^2}\,\frac{dk}{k_0} \qquad (13.7)$$

This is the number of terms in (13.4) associated with a wave number between $k$ 
and $k + dk$. 
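
The shell-volume argument can be tested by brute force. The sketch below counts the integer triples $(n, m, \ell)$ that fall inside a spherical shell and compares the count with the exact shell volume and with the thin-shell estimate $4\pi R^2\,dR$, where $R$ stands for $k/k_0$ and $dR$ for $dk/k_0$. The function name and the choice $R = 20$, $dR = 1$ are assumptions made only for illustration.

```python
import math


def count_modes_in_shell(radius, thickness):
    """Count integer triples (n, m, l) with radius <= |(n, m, l)| < radius + thickness."""
    outer = radius + thickness
    limit = math.ceil(outer)
    inner_sq, outer_sq = radius**2, outer**2
    count = 0
    for n in range(-limit, limit + 1):
        for m in range(-limit, limit + 1):
            for l in range(-limit, limit + 1):
                if inner_sq <= n * n + m * m + l * l < outer_sq:
                    count += 1
    return count


# radius plays the role of k/k0 and thickness the role of dk/k0.
radius, thickness = 20.0, 1.0
counted = count_modes_in_shell(radius, thickness)
shell_volume = 4.0 / 3.0 * math.pi * ((radius + thickness) ** 3 - radius**3)
thin_shell_estimate = 4.0 * math.pi * radius**2 * thickness  # thin-shell form

print(counted, round(shell_volume), round(thin_shell_estimate))
```

Since each lattice point occupies unit volume in $n$, $m$, $\ell$ space, the count tracks the shell volume, and the agreement with the thin-shell form improves as the shell becomes thin compared with its radius.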



5 See O. Svelto, Principles of Lasers, 4th ed., translated by D. C. Hanna, Sect. 2.2.1 (New York: 
Plenum Press, 1998). 

6 The Fourier expansion 13.4 implies that the field on the right and left of each dimension match 
up, which is known as periodic boundary conditions. 
According to the Rayleigh-Jeans assumption, each mode should carry on 
average equal energy $k_B T$. The energy density associated with a specified range of 
wave numbers $dk$ is then $k_B T/L^3$ times the number of modes within that range 
(13.7). 

The total energy density in the field involving all wave numbers is then 7 

$$u_{\rm field} = \int_0^\infty 2 \times \frac{k_B T}{L^3} \times \frac{4\pi k^2}{k_0^3}\,dk = \frac{k_B T}{\pi^2}\int_0^\infty k^2\,dk \qquad (13.8)$$

where the extra factor of 2 accounts for two independent polarizations, not specified 
in (13.4). As anticipated, the dependence on $L$ has disappeared from (13.8) 
after substituting from (13.5). 

We can immediately see that (13.8) disagrees drastically with the Stefan- 
Boltzmann law (13.2), since (13.8) is proportional to temperature rather than 
to its fourth power. In addition, the integral in (13.8) is seen to diverge, meaning 
that regardless of the temperature, the light carries infinite energy density! This 
has since been named the ultraviolet catastrophe since the divergence occurs 
on the short wavelength end of the spectrum. This is a clear failure of classical 
physics to explain blackbody radiation. Nevertheless, Rayleigh emphasized the 
fact that his formula works well for the longer wavelengths. 

It is instructive to make the change of variables $k = \omega/c$ in the integral to write 

$$u_{\rm field} = k_B T \int_0^\infty \frac{\omega^2}{\pi^2 c^3}\,d\omega \qquad (13.9)$$

The important factor $\omega^2/\pi^2 c^3$ can now be understood to be the number of modes 
per frequency. Then (13.9) is rewritten as 

$$u_{\rm field} = \int_0^\infty \rho(\omega)\,d\omega \qquad (13.10)$$

where 

$$\rho_{\text{Rayleigh-Jeans}}(\omega) = k_B T\,\frac{\omega^2}{\pi^2 c^3} \qquad (13.11)$$

describes (incorrectly) the spectral energy density of the radiation field associated 
with blackbody radiation. 
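
The divergence can be made concrete by integrating the Rayleigh-Jeans density $k_B T\,\omega^2/\pi^2 c^3$ only up to a finite cutoff frequency. The sketch below is a minimal midpoint-rule integration; the cutoff values, the temperature of 1000 K, and the function names are assumptions chosen for illustration.

```python
import math

K_B = 1.380649e-23  # J/K
C = 2.99792458e8    # m/s


def rho_rayleigh_jeans(omega, temperature_k):
    """Rayleigh-Jeans spectral energy density, k_B T omega^2 / (pi^2 c^3)."""
    return K_B * temperature_k * omega**2 / (math.pi**2 * C**3)


def energy_density_up_to(omega_max, temperature_k, n_steps=20_000):
    """Midpoint-rule integral of rho_rayleigh_jeans from 0 to omega_max."""
    d_omega = omega_max / n_steps
    return sum(
        rho_rayleigh_jeans((i + 0.5) * d_omega, temperature_k) * d_omega
        for i in range(n_steps)
    )


T = 1000.0  # K, arbitrary illustrative temperature
for omega_max in (1e14, 2e14, 4e14):
    u = energy_density_up_to(omega_max, T)
    print(f"cutoff {omega_max:.0e} rad/s -> {u:.3e} J/m^3")
```

Each doubling of the cutoff multiplies the accumulated energy density by eight, so the total grows as the cube of the cutoff and never converges.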




James Jeans (1877-1946, English) was 
born in Ormskirk, England. He attended 
Cambridge University and later taught 
there for most of his career. He also 
taught at Princeton University for a 
number of years. One of his major con- 
tributions was the development of Jeans 
length, the critical radius for interstel- 
lar clouds, which determines whether 
a cloud will collapse to form a star. In 
his later career, Jeans became some- 
what well known to the public for his 
lay-audience books highlighting scien- 
tific advances, in particular relativity 
and cosmology. (Wikipedia) 



13.3 Planck's Formula 

In the late 1800's, as spectrographic technology improved, experimenters acquired 
considerable data on the spectra of blackbody radiation. For the first time, detailed 
maps of the intensity per frequency associated with blackbody radiation 
became available over a fairly wide wavelength range. In keeping with Kirchhoff's 
notion of an ideal blackbody radiator, the results were observed to be independent 
of the material for most solids. The intensity per frequency depended only 
on temperature and when integrated over all frequencies agreed with the 
Stefan-Boltzmann law (13.1). 

7 See O. Svelto, Principles of Lasers, 4th ed., translated by D. C. Hanna, Sect. 2.2.2 (New York: 
Plenum Press, 1998). 

Figure 13.4 Energy density per 
frequency according to Planck, 
Wien, and Rayleigh-Jeans. 

Wilhelm Wien (1864-1928, German) 
was born in Gaffken, Prussia (now Primorsk, 
Russia). As a teenager, he attended 
schools in Rastenburg and then 
Heidelberg. He later attended the University 
of Göttingen and then the University 
of Berlin. In 1886, he received 
his Ph.D. after working under Hermann 
von Helmholtz where he studied the influence 
of materials on the color of light. 
In 1896 Wien developed an empirical 
formula for the spectral distribution of 
blackbody radiation. He collaborated 
with Planck, who gave the law a foundation 
in electromagnetic and thermodynamic 
theory. Planck later improved the 
formula, whereupon it became known by 
his name. However, Wien's formula for 
the peak wavelength of the blackbody 
curve, called Wien's displacement law, 
remains valid. In 1898, Wien identified 
a positive particle equal in mass to the 
hydrogen atom, which was later named 
the proton. Wien received the Nobel 
prize in 1911 for his work on heat and 
radiation. (Wikipedia) 

In 1896, Wilhelm Wien considered the known physical and mathematical 
constraints on the spectrum of blackbody radiation and proposed a spectral 
function that seemed to work: 8 

$$\rho_{\rm Wien}(\omega) = \frac{\hbar\omega^3\, e^{-\hbar\omega/k_B T}}{\pi^2 c^3} \qquad (13.12)$$

An important feature of (13.12) is that it gives a result proportional to $T^4$ when 
integrated over all frequencies $\omega$ (i.e. the Stefan-Boltzmann law). 

Wien's formula did a fairly good job of fitting the experimental data. However, 
in 1900 Lummer and Pringsheim, colleagues of Max Planck, reported experimental 
data that deviated from the Wien distribution at long wavelengths (infrared). 
Planck was privy to this information early on and introduced a modest revision 
to Wien's formula that fit the data beautifully everywhere: 

$$\rho_{\rm Planck}(\omega) = \frac{\hbar\omega^3}{\pi^2 c^3\left(e^{\hbar\omega/k_B T} - 1\right)} \qquad (13.13)$$

where $\hbar = 1.054 \times 10^{-34}$ J·s is an experimentally determined constant. 9 

Figure 13.4 shows the Planck spectral distribution curve together with the 
Rayleigh-Jeans curve (13.11) and the Wien curve (13.12). As is apparent, the Wien 
distribution does a good job nearly everywhere. However, at long wavelengths 
it was off by just enough for the experimentalists to notice that something was 
wrong. 
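
The relationship among the three curves can be seen by evaluating their ratios at a few values of $x = \hbar\omega/k_B T$. The sketch below assumes the standard forms of the three spectral densities (Rayleigh-Jeans $k_B T\,\omega^2/\pi^2 c^3$, Wien $\hbar\omega^3 e^{-x}/\pi^2 c^3$, and Planck $\hbar\omega^3/\pi^2 c^3 (e^{x}-1)$); the temperature of 5000 K and the function names are illustrative choices.

```python
import math

HBAR = 1.054571817e-34  # J s
K_B = 1.380649e-23      # J/K
C = 2.99792458e8        # m/s


def rho_rayleigh_jeans(omega, t):
    return K_B * t * omega**2 / (math.pi**2 * C**3)


def rho_wien(omega, t):
    return HBAR * omega**3 * math.exp(-HBAR * omega / (K_B * t)) / (math.pi**2 * C**3)


def rho_planck(omega, t):
    # expm1(x) = exp(x) - 1, accurate even when x is tiny
    return HBAR * omega**3 / (math.pi**2 * C**3 * math.expm1(HBAR * omega / (K_B * t)))


T = 5000.0  # K, arbitrary illustrative temperature
for x in (0.01, 1.0, 10.0):  # x = hbar*omega / (k_B T)
    omega = x * K_B * T / HBAR
    planck = rho_planck(omega, T)
    print(f"x = {x:5.2f}  RJ/Planck = {rho_rayleigh_jeans(omega, T) / planck:10.3f}"
          f"  Wien/Planck = {rho_wien(omega, T) / planck:.5f}")
```

At small $x$ (long wavelengths) the Rayleigh-Jeans curve approaches the Planck curve while the Wien curve falls well below it; at large $x$ the Wien curve is nearly indistinguishable from Planck and the Rayleigh-Jeans curve overshoots enormously.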

At this point, it may seem fair to ask, what did Planck do that was so great? 
After all, he simply guessed a function that was only a slight modification of 
Wien's distribution. And he knew the 'answer from the back of the book', namely 
Lummer and Pringsheim's carefully executed experimental results. (At the time, Planck 
was unaware of the work by Rayleigh.) 

Planck gets well-deserved credit for interpreting the meaning of his new 
formula. His interpretation was what he called an "act of desperation." He did 
not necessarily believe in the implications of his formula; in fact, he presented 
them somewhat apologetically. It was several years later that the young Einstein 
published his paper explaining the photoelectric effect in the context of Planck's 
work. 

Planck's insight was an enormous step toward understanding the quantum 
nature of light. Nevertheless, it took another three decades to develop a more 
complete theory of quantum electrodynamics. Students should appreciate that 
the very people who developed quantum mechanics were also bothered by its 
confrontation with deep-seated intuition. If quantum mechanics bothers you, 
you are in good company! 

Planck found that he could derive his formula only if he made the following 
strange assumption: A given mode of the electromagnetic field is not able to 
carry an arbitrary amount of energy (for example, $k_B T$ as Rayleigh and Jeans 
used, which varies continuously as the temperature varies). Rather, the field 
can only carry discrete amounts of energy separated by spacing $\hbar\omega$. Under this 
assumption, the probability $P_n$ that a mode of the field is excited to the $n$th level 
is proportional to the Boltzmann statistical weighting factor $e^{-n\hbar\omega/k_B T}$. A review 
of the Boltzmann factor is given in Appendix 13.B. 

8 The constant $\hbar$ had not yet been introduced by Planck. The actual way that Wien wrote his 
distribution was $\rho_{\rm Wien}(\omega) = a\,\omega^3 e^{-b\omega/T}$, where $a$ and $b$ were parameters used to fit the data. 

9 Planck's constant was first introduced as $h = 6.626 \times 10^{-34}$ J·s, convenient for working with 
frequency $\nu$, expressed in Hz. It is common to write $\hbar = h/2\pi$ when working with frequency $\omega$, 
expressed in rad/s. 

Probable Energy in Each Field Mode 

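The Boltzmann factor can be normalized by dividing by the sum of all such factors 
to obtain the probability of having energy $n\hbar\omega$ in a particular mode: 

$$P_n = \frac{e^{-n\hbar\omega/k_B T}}{\sum_{m=0}^{\infty} e^{-m\hbar\omega/k_B T}} = e^{-n\hbar\omega/k_B T}\left[1 - e^{-\hbar\omega/k_B T}\right] \qquad (13.14)$$

We used (0.66) to accomplish the above sum, which is a geometric series. 

The expected energy in a particular mode of the field is the sum of each possible 
energy level (i.e. $n\hbar\omega$) times the probability of it occurring: 

$$\begin{aligned}
\sum_{n=0}^{\infty} n\hbar\omega\,P_n &= \hbar\omega\left[1 - e^{-\hbar\omega/k_B T}\right]\sum_{n=0}^{\infty} n\,e^{-n\hbar\omega/k_B T} \\
&= -\hbar\omega\left[1 - e^{-\hbar\omega/k_B T}\right]\frac{\partial}{\partial(\hbar\omega/k_B T)}\sum_{n=0}^{\infty} e^{-n\hbar\omega/k_B T} \\
&= -\hbar\omega\left[1 - e^{-\hbar\omega/k_B T}\right]\frac{\partial}{\partial(\hbar\omega/k_B T)}\,\frac{1}{1 - e^{-\hbar\omega/k_B T}} \\
&= \frac{\hbar\omega}{e^{\hbar\omega/k_B T} - 1}
\end{aligned} \qquad (13.15)$$

We used (0.66) again as well as a clever derivative trick. 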
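
The closed form for the expected energy can be checked against a direct sum over levels. The sketch below works in units of $\hbar\omega$ with $x = \hbar\omega/k_B T$, assuming the normalized probabilities $P_n = e^{-nx}(1 - e^{-x})$; the function names and the truncation of the sum at a finite `n_max` are choices made for this illustration.

```python
import math


def mean_energy_by_sum(x, n_max=2000):
    """Expected mode energy in units of hbar*omega, summing n * P_n directly.

    x = hbar*omega / (k_B T); P_n = exp(-n x) * (1 - exp(-x)).
    The sum is truncated at n_max, which is an approximation.
    """
    normalization = 1.0 - math.exp(-x)
    return sum(n * math.exp(-n * x) * normalization for n in range(n_max + 1))


def mean_energy_closed_form(x):
    """Closed form in units of hbar*omega: 1 / (exp(x) - 1)."""
    return 1.0 / math.expm1(x)


for x in (0.1, 1.0, 5.0):
    print(x, mean_energy_by_sum(x), mean_energy_closed_form(x))
```

For small $x$ the expected energy $\hbar\omega/(e^{x} - 1)$ approaches $k_B T$, which recovers the classical equipartition value used by Rayleigh and Jeans at low frequencies.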

Equation (13.15) provides the expected energy in any of the modes of the radi- 
ation field, as dictated by Planck's assumption. To obtain the Planck distribution 
(13.13), we replace k B T in the Rayleigh-Jeans formula (13.10) with the correct 
expected energy (13.15). 10 

10 See O. Svelto, Principles of Lasers, 4th ed., translated by D. C. Hanna, Sect. 2.2.2 (New York: 
Plenum Press, 1998). 

Max Planck (1858-1947, German) 
was born in Kiel, the sixth child in his 
family. His father was a law professor. 
When Max was about nine years old, 
his family moved to Munich where he 
attended gymnasium. A mathematician, 
Hermann Müller, took an interest in his 
schooling and tutored him in mechanics 
and astronomy. Planck was a gifted 
musician, but he decided to pursue a 
career in physics. At age 16 he enrolled 
in the University of Munich. By age 22, 
he had finished his doctoral dissertation 
and habilitation thesis. He was initially 
ignored by the academic community and 
worked for a time as an unpaid lecturer. 
He became an associate professor of 
theoretical physics at the University of 
Kiel and then a few years later took 
over Kirchhoff's post at the University 
of Berlin. After nearly twenty years of 
idyllic and happy family life, a series 
of tragedies hit the Planck household. 
Planck's first wife, mother of four, 
died. Then his eldest son was killed 
in action during World War I. Soon 
after, his twin daughters each died 
giving birth to their first child. Later 
Planck's remaining son from his first 
marriage was executed for participating 
in a failed attempt to assassinate Hitler. 
Planck won the Nobel prize in 1918 for 
his introduction of energy quanta, but 
he had serious reservations about the 
course that quantum mechanics theory 
took. (Wikipedia) 

It is interesting that we are now able to derive the constant in the 
Stefan-Boltzmann law (13.2) in terms of Planck's constant $\hbar$ (see P13.3). The 
Stefan-Boltzmann law is obtained by integrating the spectral density function (13.13) 
over all frequencies to obtain the total field energy density, which is in thermal 
equilibrium with the blackbody radiator: 

$$u_{\rm field} = \int_0^\infty \rho_{\rm Planck}(\omega)\,d\omega = \frac{4\pi^2 k_B^4}{60\,c^3\hbar^3}\,T^4 = \frac{4}{c}\,\sigma T^4 \qquad (13.16)$$



Since Planck's constant was not introduced until a couple of decades after the 
Stefan-Boltzmann law was developed, one might more appropriately say that the 
Stefan-Boltzmann constant pins down Planck's constant. 
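
The link between $\sigma$ and $\hbar$ in (13.16) can be checked numerically. The sketch below assumes the Planck form $\rho = \hbar\omega^3/\pi^2 c^3(e^{\hbar\omega/k_B T}-1)$ for (13.13), substitutes $x = \hbar\omega/k_B T$, and integrates $x^3/(e^x - 1)$ with a simple midpoint rule; the cutoff `x_max`, the step count, and the function names are choices made for illustration.

```python
import math

HBAR = 1.054e-34   # J*s, from the Physical Constants table
K_B = 1.380e-23    # J/K
C = 2.9979e8       # m/s

def planck_integral(x_max=50.0, steps=20000):
    """Midpoint-rule estimate of the integral of x^3 / (e^x - 1) from 0 to x_max."""
    h = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h          # midpoints avoid the removable point x = 0
        total += x**3 / math.expm1(x)
    return total * h

def stefan_boltzmann_constant():
    """sigma from u_field = (k_B T)^4 I / (pi^2 c^3 hbar^3) = 4 sigma T^4 / c."""
    integral = planck_integral()
    return K_B**4 * integral / (4.0 * math.pi**2 * C**2 * HBAR**3)

print("integral    :", planck_integral())
print("pi^4 / 15   :", math.pi**4 / 15)
print("sigma (SI)  :", stefan_boltzmann_constant())
```

With the rounded constants above, the result should land close to the tabulated value of $\sigma$, roughly $5.67 \times 10^{-8}$ W/m$^2$·K$^4$.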



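Example 13.1 

Determine $\rho_{\rm Planck}(\lambda)$ such that 

$$u_{\rm field} = \int_0^\infty \rho_{\rm Planck}(\omega)\,d\omega = \int_0^\infty \rho_{\rm Planck}(\lambda)\,d\lambda$$

where $\rho_{\rm Planck}(\omega)$ and $\rho_{\rm Planck}(\lambda)$ represent distinct functions distinguished by their 
arguments. 

Solution: The change of variables $\lambda = 2\pi c/\omega \Rightarrow d\omega = -2\pi c\,d\lambda/\lambda^2$ gives 

$$u_{\rm field} = \int_\infty^0 \frac{\hbar\,(2\pi c/\lambda)^3}{\pi^2 c^3\left[e^{\hbar(2\pi c/\lambda)/k_B T} - 1\right]}\left(-\frac{2\pi c}{\lambda^2}\right)d\lambda = \int_0^\infty \frac{16\pi^2\hbar c}{\lambda^5\left[e^{2\pi\hbar c/\lambda k_B T} - 1\right]}\,d\lambda$$

By inspection, we get 

$$\rho_{\rm Planck}(\lambda) = \frac{8\pi h c}{\lambda^5\left(e^{hc/\lambda k_B T} - 1\right)} \tag{13.17}$$

where we have written $h = 2\pi\hbar$. It is interesting to note that the maximum of 
$\rho_{\rm Planck}(\lambda)$ occurring at $\lambda_{\max}$ and the maximum of $\rho_{\rm Planck}(\omega)$ occurring at $\omega_{\max}$ do not 
correspond to a matching wavelength and frequency. That is, $\lambda_{\max} \neq 2\pi c/\omega_{\max}$, 
because of the nonlinear nature of the variable transformation. (See problem 
P13.4.) 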
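
The mismatch between the two maxima can be seen numerically. In the sketch below, maximizing $x^5/(e^x-1)$ (the wavelength form, with $x = hc/\lambda k_B T$) and $x^3/(e^x-1)$ (the frequency form, with $x = \hbar\omega/k_B T$) leads to the transcendental equations $(5-x)e^x = 5$ and $(3-x)e^x = 3$, which are solved by bisection. The helper name `peak_root` and the bracketing interval are assumptions of this illustration.

```python
import math

def peak_root(k, tol=1e-12):
    """Nonzero root of (k - x) e^x = k, found by bisection on [1, k].

    The root locates the maximum of x^k / (e^x - 1).
    """
    def g(x):
        return (k - x) * math.exp(x) - k

    lo, hi = 1.0, float(k)       # g(lo) > 0 and g(hi) < 0 for k = 3 and k = 5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

x_lambda = peak_root(5)   # peak of rho_Planck(lambda)
x_omega = peak_root(3)    # peak of rho_Planck(omega)

# lambda_max = h c / (x_lambda k_B T), while 2 pi c / omega_max = h c / (x_omega k_B T),
# so their ratio is independent of temperature.
print("x at wavelength peak:", x_lambda)
print("x at frequency peak :", x_omega)
print("(2 pi c / omega_max) / lambda_max =", x_lambda / x_omega)
```

The ratio in the last line differs from 1, so the wavelength corresponding to $\omega_{\max}$ is noticeably longer than $\lambda_{\max}$.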



13.4 Einstein's A and B Coefficients 

More than a decade after Planck introduced his formula, and after Niels Bohr 
had proposed that electrons occupy discrete energy states in atoms, Einstein 
reexamined blackbody radiation in terms of Bohr's new idea. If the material of a 
blackbody radiator interacts with a particular mode of the field with frequency $\omega$, 
then electrons in the material must make transitions between two energy levels 
with energy separation $\hbar\omega$. Since the radiation of a blackbody is in thermal 
equilibrium with the material, Einstein postulated that the field stimulates electron 
transitions between energy levels. In addition, he postulated that some transitions 
must occur spontaneously. (If the possibility of spontaneous transitions is 
not included, then there can be no way for a field mode to receive energy if none 
is present to begin with.) 

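Einstein wrote down rate equations for populations of the two levels $N_1$ and 
$N_2$ associated with the transition $\hbar\omega$:$^{11}$ 

$$\begin{aligned}
\dot{N}_1 &= A_{21} N_2 - B_{12}\,\rho(\omega)\,N_1 + B_{21}\,\rho(\omega)\,N_2, \\
\dot{N}_2 &= -A_{21} N_2 + B_{12}\,\rho(\omega)\,N_1 - B_{21}\,\rho(\omega)\,N_2
\end{aligned} \tag{13.18}$$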



The coefficient $A_{21}$ is the rate of spontaneous emission from state 2 to state 1, 
$B_{12}\,\rho(\omega)$ is the rate of stimulated absorption from state 1 to state 2, and $B_{21}\,\rho(\omega)$ 
is the rate of stimulated emission from state 2 to state 1. 

In thermal equilibrium, the rate equations (13.18) are both equal to zero (i.e., 
$\dot{N}_1 = \dot{N}_2 = 0$), since the relative populations of each level must remain constant. 
We can then solve for the spectral density $\rho(\omega)$ at the given frequency. In this case, 
either expression in (13.18) yields 

$$\rho(\omega) = \frac{A_{21}}{\dfrac{N_1}{N_2} B_{12} - B_{21}} \tag{13.19}$$

In thermal equilibrium, the spectral density must match the Planck spectral 
density formula (13.13). In making the comparison, we should first rewrite the 
ratio $N_1/N_2$ of the populations in the two levels using the Boltzmann probability 
factor (see Appendix 13.B): 

$$\frac{N_1}{N_2} = \frac{e^{-E_1/k_B T}}{e^{-E_2/k_B T}} = e^{(E_2 - E_1)/k_B T} = e^{\hbar\omega/k_B T} \tag{13.20}$$



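Then when equating (13.19) to the Planck blackbody spectral density (13.13) we 
get 

$$\frac{A_{21}}{e^{\hbar\omega/k_B T} B_{12} - B_{21}} = \frac{\hbar\omega^3}{\pi^2 c^3\left(e^{\hbar\omega/k_B T} - 1\right)} \tag{13.21}$$

From this expression we deduce that$^{12}$ 

$$B_{12} = B_{21} \tag{13.22}$$

and 

$$A_{21} = \frac{\hbar\omega^3}{\pi^2 c^3}\,B_{21} \tag{13.23}$$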



11 See P. W. Milonni, The Quantum Vacuum: An Introduction to Quantum Electrodynamics, Sect. 
1.8 (San Diego: Academic Press, 1994). 

12 We assume that energy levels 1 and 2 are non-degenerate. Some modifications must be made 
in the case of degenerate levels, but the procedure is similar. 

Albert Einstein (1879-1955, German) 
is without a doubt the most famous 
scientist in history. Time Magazine named 
him Person of the Century. Born in 
Ulm to a (non-practicing) Jewish 
family, young Albert was influenced by a 
medical student, Max Talmud, who took 
meals with his family and enthusiastically 
introduced the 10-year-old Albert 
to geometry and other topics. Einstein's 
father wanted Albert to be trained as an 
electrical engineer, but Albert clashed 
with his teachers in that program and 
withdrew. Einstein then attended school 
in Switzerland, and subsequently 
entered a mathematics program at the 
Polytechnic in Zurich. There, Einstein 
met his first wife, Mileva Marić, a fellow 
math student, whom he later divorced 
before marrying Elsa Löwenthal. Early 
on, Einstein could not find a job as 
a professor, and so he worked in the 
Swiss patent office until his "Miracle 
Year" (1905), when he published four 
major papers, including relativity and the 
photoelectric effect (for which he later 
received the Nobel prize). Thereafter, 
job offers were never in short supply. 
In 1933, as the Nazi regime came to 
power, Einstein emigrated from Berlin 
to the US and became a professor at 
Princeton University. Einstein is most 
noted for special and general relativity, 
for which he became a celebrity scientist 
in his own lifetime. Einstein also made 
huge contributions to statistical and 
quantum mechanics. (Wikipedia) 

We see from (13.22) that the rate of stimulated absorption is the same as the 
rate of stimulated emission. In addition, if one knows the rate of stimulated 
emission between a pair of states, it follows from (13.23) that one also knows the 
rate of spontaneous emission. This is remarkable because to derive $A_{21}$ directly, 
one needs to use the full theory of quantum electrodynamics (the complete 
photon description). However, to obtain $B_{21}$, it is actually only necessary to use a 
semiclassical theory, where the light is treated classically and the energy levels in 
the material are treated quantum-mechanically using the Schrödinger equation. 
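
To get a feel for the relative size of the two emission processes, one can combine (13.23) with the Planck spectral density: in thermal equilibrium the ratio of spontaneous to stimulated emission rates is $A_{21}/B_{21}\rho(\omega) = e^{\hbar\omega/k_B T} - 1$. The sketch below evaluates this ratio; the Planck form used for (13.13) is assumed from earlier in the chapter, the value of `B21` is arbitrary (it cancels), and the two sample wavelengths are hypothetical.

```python
import math

HBAR = 1.054e-34   # J*s
K_B = 1.380e-23    # J/K
C = 2.9979e8       # m/s

def planck_density(omega, T):
    """Assumed form of the Planck spectral density (13.13)."""
    return HBAR * omega**3 / (math.pi**2 * C**3 * math.expm1(HBAR * omega / (K_B * T)))

def spontaneous_to_stimulated(omega, T, B21=1.0):
    """Ratio A21 / (B21 * rho) in thermal equilibrium, with A21 from (13.23)."""
    A21 = HBAR * omega**3 / (math.pi**2 * C**3) * B21
    return A21 / (B21 * planck_density(omega, T))

T = 300.0  # K, room temperature
for label, wavelength in (("visible, 500 nm", 500e-9), ("microwave, 3 cm", 3e-2)):
    omega = 2.0 * math.pi * C / wavelength
    print(f"{label}: A21 / (B21 rho) = {spontaneous_to_stimulated(omega, T):.3e}")
```

At room temperature, spontaneous emission dominates overwhelmingly for visible transitions, while stimulated emission dominates for microwave transitions.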

In writing the rate equations (13.18), Einstein predicted the possibility of 
creating lasers roughly four decades in advance of their development. These rate 
equations are still valid even if the light is not in thermal equilibrium with the material. 
The equations suggest that if the population in the upper state 2 can be made 
artificially large, then amplification will result via the stimulated transition. The 
rate equations also show that a population inversion (more population in the 
upper state than in the lower one) cannot be achieved by 'pumping' the material 
with the same frequency of light that one hopes to amplify. This is because the 
stimulated absorption rate is balanced by the stimulated emission rate. The 
material-dependent parameters $A_{21}$ and $B_{12} = B_{21}$ are called the Einstein A and B 
coefficients. 
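
The statement that same-frequency pumping cannot invert a two-level system can be illustrated by integrating (13.18) directly. The sketch below is a minimal forward-Euler integration under stated assumptions: $B_{12} = B_{21}$, populations normalized so that $N_1 + N_2 = 1$, and rates measured in units of $A_{21}$. The name `pump_rate` stands for $B_{21}\rho(\omega)$ and is introduced here only for illustration.

```python
def steady_upper_fraction(pump_rate, A21=1.0, steps=5000):
    """Forward-Euler integration of (13.18) with B12 = B21.

    pump_rate plays the role of B21 * rho(omega). Starts with all
    population in level 1 and returns N2 after about 50 relaxation times.
    """
    n1, n2 = 1.0, 0.0
    dt = 0.01 / (A21 + 2.0 * pump_rate)   # small step relative to the relaxation time
    for _ in range(steps):
        dn2 = (-A21 * n2 + pump_rate * (n1 - n2)) * dt
        n2 += dn2
        n1 -= dn2                         # total population is conserved
    return n2

for pump in (0.1, 1.0, 100.0, 1e4):
    n2 = steady_upper_fraction(pump)
    print(f"pump rate = {pump:g} A21: N2 = {n2:.6f}, N2/N1 = {n2 / (1.0 - n2):.6f}")
```

Setting $\dot{N}_2 = 0$ gives $N_2 = B\rho/(A_{21} + 2B\rho)$, which approaches but never reaches 1/2 as the pump intensity grows, so $N_2/N_1$ stays below 1.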



Appendix 13.A Thermodynamic Derivation of the Stefan-Boltzmann Law 

In this appendix, we derive the Stefan-Boltzmann law without relying on the 
Planck blackbody formula.$^{13}$ This derivation is included mainly for historical 
interest. The derivation relies on the 1st and 2nd laws of thermodynamics. 

Consider a container whose walls are all at the same temperature and in 
thermal equilibrium with the radiation field inside. Notice that the units of energy 
density $u_{\rm field}$ (energy per volume) are equivalent to force per area, or in other 
words pressure. It turns out that the radiation exerts a pressure of 

$$P = u_{\rm field}/3 \tag{13.24}$$

on the walls of the container. This can be derived from the fact that radiation of 
energy $\Delta E$ imparts a momentum 

$$\Delta p = \frac{\Delta E}{c}\cos\theta \tag{13.25}$$

when it is absorbed with incident angle $\theta$ on a surface.$^{14}$ A similar momentum is 
imparted when radiation is emitted. 



13 See P. W. Milonni, The Quantum Vacuum: An Introduction to Quantum Electrodynamics, Sect. 
1.2 (San Diego: Academic Press, 1994). 

14 The fact that light carries momentum was understood well before the development of the 
theory of relativity and the photon description of light. 

Derivation of (13.24) 



Consider a thin layer of space adjacent to a container wall with area $A$. If the layer 
has thickness $\Delta z$, then the volume in the layer is $A\Delta z$. Half of the radiation inside 
the layer flows toward the wall, where it is absorbed. The total energy in the layer 
that will be absorbed is then $\Delta E = (A\Delta z)\,u_{\rm field}/2$, which arrives during the interval 
$\Delta t = \Delta z/(c\cos\theta)$, assuming for the moment that all light is directed with angle $\theta$; 
we must average the angle of light propagation over a hemisphere. 

The pressure on the wall due to absorption (i.e. force or $dp/dt$ per area) is then 

$$P_{\rm abs} = \left\langle \frac{\Delta p}{A\,\Delta t} \right\rangle = \frac{u_{\rm field}}{2}\, \frac{\int_0^{2\pi} d\phi \int_0^{\pi/2} \cos^2\theta\,\sin\theta\,d\theta}{\int_0^{2\pi} d\phi \int_0^{\pi/2} \sin\theta\,d\theta} = \frac{u_{\rm field}}{6} \tag{13.26}$$

In equilibrium, an equal amount of radiation is also emitted from the wall. This 
gives an additional pressure $P_{\rm emit} = P_{\rm abs}$, which confirms that the total pressure is 
given by (13.24). 
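
The hemisphere average that produces the factor of 1/3 is easy to check numerically. The sketch below evaluates the solid-angle average of $\cos^2\theta$ over a hemisphere with a midpoint rule; the function name and step count are assumptions of this illustration, and the azimuthal integrals are omitted because they cancel in the ratio.

```python
import math

def hemisphere_average_cos2(steps=100000):
    """Solid-angle average of cos^2(theta) over 0 <= theta <= pi/2."""
    h = (math.pi / 2.0) / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        weight = math.sin(theta)              # solid-angle weighting
        numerator += math.cos(theta) ** 2 * weight
        denominator += weight
    return numerator / denominator

average = hemisphere_average_cos2()
p_abs_over_u = 0.5 * average        # half of the energy travels toward the wall
p_total_over_u = 2.0 * p_abs_over_u # absorption plus an equal emission recoil
print("average of cos^2 :", average)
print("P_abs / u_field  :", p_abs_over_u)
print("P / u_field      :", p_total_over_u)
```

The printed ratios should be close to 1/3, 1/6, and 1/3 respectively.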

We derive the Stefan-Boltzmann law using the concept of entropy, which is 
defined in differential form by the quantity 

$$dS = \frac{dQ}{T} \tag{13.27}$$

Figure 13.5 Field inside a blackbody radiator. 

where $dQ$ is the injection of heat (or energy) into the radiation field in the box 
and $T$ is the temperature at which that injection takes place. We would like to 
write $dQ$ in terms of $u_{\rm field}$, $V$, and $T$. Then we may invoke the fact that $S$ is a state 
variable, which implies 

$$\frac{\partial^2 S}{\partial T\,\partial V} = \frac{\partial^2 S}{\partial V\,\partial T} \tag{13.28}$$

This is a mathematical statement of the fact that S is fully defined if the internal 
energy, temperature, and volume of a system are specified. That is, S does not 
depend on past temperature and volume history; it is dictated by the present 
state of the system. 

To obtain $dQ$ in the form that we need, we can use the 1st law of thermodynamics. 
It states that a change in internal energy $dU = d(u_{\rm field} V)$ can take place 
by the injection of heat $dQ$ or by doing work $dW = P\,dV$ as the volume increases: 

$$\begin{aligned}
dQ &= dU + P\,dV = d(u_{\rm field} V) + P\,dV \\
&= V\,du_{\rm field} + u_{\rm field}\,dV + \frac{1}{3}u_{\rm field}\,dV \\
&= V\frac{du_{\rm field}}{dT}\,dT + \frac{4}{3}u_{\rm field}\,dV
\end{aligned} \tag{13.29}$$

We have used energy density times volume to obtain the total energy $U$ in the 
radiation field in the box. We have also used (13.24) to obtain the work accomplished 
by pressure as the volume changes. 

We can use (13.29) to rewrite (13.27) as 

$$dS = \frac{V}{T}\frac{du_{\rm field}}{dT}\,dT + \frac{4u_{\rm field}}{3T}\,dV \tag{13.30}$$

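When we differentiate (13.30) with respect to temperature or volume we get 

$$\begin{aligned}
\frac{\partial S}{\partial V} &= \frac{4u_{\rm field}}{3T} \\
\frac{\partial S}{\partial T} &= \frac{V}{T}\frac{du_{\rm field}}{dT}
\end{aligned} \tag{13.31}$$

We are now able to evaluate the partial derivatives in (13.28), which give 

$$\begin{aligned}
\frac{\partial^2 S}{\partial T\,\partial V} &= \frac{4}{3}\frac{\partial}{\partial T}\frac{u_{\rm field}}{T} = \frac{4}{3}\frac{1}{T}\frac{du_{\rm field}}{dT} - \frac{4}{3}\frac{u_{\rm field}}{T^2} \\
\frac{\partial^2 S}{\partial V\,\partial T} &= \frac{1}{T}\frac{du_{\rm field}}{dT}
\end{aligned} \tag{13.32}$$

Since by (13.28) these two expressions must be equal, we get a differential 
equation relating the internal energy of the system to the temperature: 

$$\frac{4}{3}\frac{1}{T}\frac{du_{\rm field}}{dT} - \frac{4}{3}\frac{u_{\rm field}}{T^2} = \frac{1}{T}\frac{du_{\rm field}}{dT} \quad\Rightarrow\quad \frac{du_{\rm field}}{dT} = \frac{4u_{\rm field}}{T} \tag{13.33}$$

The solution to this differential equation is (13.2), where $4\sigma/c$ is a constant to be 
determined experimentally. 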
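
That the differential equation in (13.33) forces $u_{\rm field} \propto T^4$ can be confirmed by integrating it numerically. The sketch below uses a standard fourth-order Runge-Kutta step; the starting temperature, the arbitrary starting value `u0`, and the function name are assumptions made for illustration.

```python
def integrate_energy_density(T0, T1, u0, steps=1000):
    """Integrate du/dT = 4u/T from T0 to T1 with classical RK4."""
    def f(T, u):
        return 4.0 * u / T

    h = (T1 - T0) / steps
    T, u = T0, u0
    for _ in range(steps):
        k1 = f(T, u)
        k2 = f(T + h / 2.0, u + h * k1 / 2.0)
        k3 = f(T + h / 2.0, u + h * k2 / 2.0)
        k4 = f(T + h, u + h * k3)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        T += h
    return u

# Doubling the temperature should multiply the energy density by 2^4 = 16.
u_ratio = integrate_energy_density(300.0, 600.0, 1.0)
print("u(600 K) / u(300 K) =", u_ratio)
```

The thermodynamic argument fixes only the $T^4$ scaling; the prefactor $4\sigma/c$ still has to come from experiment or from the Planck formula.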

Appendix 13.B Boltzmann Factor 

The entropy of an object is defined by 

$$S_{\rm obj} = k_B \ln n_{\rm obj} \tag{13.34}$$

which depends on the number of configurations $n_{\rm obj}$ for a given state (defined, 
for example, by fixed energy and volume). Now imagine that the object is placed 
in contact with a very large thermal reservoir. The 'object' could be the 
electromagnetic radiation inside a hollow blackbody apparatus, and the reservoir could 
be the walls of the apparatus, capable of holding far more energy than the light 
field can hold. The condition for thermal equilibrium between the object and the 
reservoir is 

$$\frac{\partial S_{\rm obj}}{\partial U_{\rm obj}} = \frac{\partial S_{\rm res}}{\partial U_{\rm res}} \equiv \frac{1}{T} \tag{13.35}$$

where temperature has been introduced as a definition, which is consistent with 
(13.27). 

The total number of configurations for the combined system is $N = n_{\rm obj}\,n_{\rm res}$, 
where $n_{\rm obj}$ and $n_{\rm res}$ are the number of configurations available within the object 
and the reservoir separately. A thermodynamic principle is that all possible 
configurations are equally probable. In thermal equilibrium, the probability for a 
given configuration in the object is therefore proportional to 

$$P \propto n_{\rm res} = e^{S_{\rm res}/k_B} \tag{13.36}$$

where we have invoked (13.34). 

Meanwhile, a Taylor's series expansion of $S_{\rm res}$ yields 

$$S_{\rm res}(U_{\rm res}) = S_{\rm res}(U_{\rm res}^{\rm eq}) + \left.\frac{\partial S_{\rm res}}{\partial U_{\rm res}}\right|_{U_{\rm res}^{\rm eq}} \left(U_{\rm res} - U_{\rm res}^{\rm eq}\right) + \ldots \tag{13.37}$$

Higher order terms are not needed since we assume the reservoir to be very large 
so that it is disturbed only slightly by variations in the object. Since the overall 
energy of the system is fixed, we may write 

$$U_{\rm res} - U_{\rm res}^{\rm eq} \equiv \Delta U_{\rm res} = -\Delta U_{\rm obj} \tag{13.38}$$

where $\Delta U_{\rm obj}$ is a small change in energy in the object. When (13.35), (13.37), and 
(13.38) are introduced into (13.36), the probability for the specific configuration 
becomes $P \propto e^{S_{\rm res}(U_{\rm res}^{\rm eq})/k_B \,-\, \Delta U_{\rm obj}/k_B T}$, or simply 

$$P \propto e^{-\Delta U_{\rm obj}/k_B T} \tag{13.39}$$

since the first term in the exponent is constant. $\Delta U_{\rm obj}$ represents an amount of 
energy added to the object to establish a configuration. In the case of blackbody 
radiation, a mode takes on energy $\Delta U_{\rm obj} = n\hbar\omega$, where $n$ is the number of energy 
quanta in the mode. The probability that a mode carries energy $n\hbar\omega$ is therefore 
proportional to $e^{-n\hbar\omega/k_B T}$. 
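
The counting argument behind the Boltzmann factor can be tried out on a toy reservoir. The sketch below is an illustration under a specific assumption that is not part of the text: the reservoir is modeled as a large collection of identical oscillators sharing a fixed number of energy quanta, so that its configurations can be counted exactly with binomial coefficients. The 'object' is a single mode that removes $n$ quanta from the reservoir. All names and sizes are hypothetical.

```python
import math

def reservoir_configs(oscillators, quanta):
    """Number of ways to distribute identical quanta among the oscillators."""
    return math.comb(quanta + oscillators - 1, oscillators - 1)

def probability_ratios(oscillators=10000, total_quanta=5000, n_levels=6):
    """Ratios P(n+1)/P(n), with P(n) proportional to the reservoir count (13.36)."""
    counts = [reservoir_configs(oscillators, total_quanta - n)
              for n in range(n_levels + 1)]
    return [counts[n + 1] / counts[n] for n in range(n_levels)]

def boltzmann_ratio(oscillators=10000, total_quanta=5000):
    """exp(-epsilon / k_B T), using 1/T = dS_res/dU_res as a finite difference."""
    d_ln_n = (math.log(reservoir_configs(oscillators, total_quanta + 1))
              - math.log(reservoir_configs(oscillators, total_quanta)))
    return math.exp(-d_ln_n)   # epsilon / (k_B T) equals d(ln n_res) per quantum

print("successive ratios:", probability_ratios())
print("Boltzmann factor :", boltzmann_ratio())
```

Each additional quantum in the object lowers the probability by nearly the same factor, which is the geometric behavior $e^{-n\hbar\omega/k_B T}$ expected from (13.39). The small drift in the ratios shrinks as the toy reservoir is made larger.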
Exercises 



Exercises for 13.1 Stefan-Boltzmann Law 

P13.1 The Sun has a radius of $R_s = 6.96 \times 10^8$ m. What is the total power that 
it radiates, given a surface temperature of 5750 K? 

P13.2 A 1 cm-radius spherical ball of polished gold hangs suspended inside 
an evacuated chamber that is at room temperature 20°C. There is no 
pathway for thermal conduction to the chamber wall. 

(a) If the gold is at a temperature of 100°C, what is the initial rate of 
temperature loss in °C/s? The emissivity for polished gold is $\epsilon = 0.02$. 
The specific heat of gold is 129 J/kg·°C and its density is 19.3 g/cm$^3$. 

HINT: $Q = mc\Delta T$ and Power $= Q/\Delta t$. 

(b) What is the initial rate of temperature loss if the ball is coated with 
flat black paint, which has emissivity $\epsilon = 0.95$? 

HINT: You should consider the energy flowing both ways. 



Exercises for 13.3 Planck's Formula 

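P13.3 Derive (or try to derive) the Stefan-Boltzmann law by integrating the 

(a) Rayleigh-Jeans energy density 

$$u_{\rm field} = \int_0^\infty \rho_{\rm Rayleigh\text{-}Jeans}(\omega)\,d\omega$$

Please comment. 

(b) Wien energy density 

$$u_{\rm field} = \int_0^\infty \rho_{\rm Wien}(\omega)\,d\omega$$

Please evaluate $\sigma$. 

HINT: $\int_0^\infty x^3 e^{-ax}\,dx = \dfrac{6}{a^4}$. 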

(c) Planck energy density 

$$u_{\rm field} = \int_0^\infty \rho_{\rm Planck}(\omega)\,d\omega$$

Please evaluate $\sigma$. Compare results of (b) and (c). 

HINT: $\int_0^\infty \dfrac{x^3\,dx}{e^{ax} - 1} = \dfrac{\pi^4}{15 a^4}$. 

P13.4 (a) Derive Wien's displacement law 

$$\lambda_{\max} = \frac{0.00290\ \text{m·K}}{T}$$

which gives the strongest wavelength present in the blackbody spectral 
distribution. 

HINT: See Example 13.1. You may like to know that the solution to the 
transcendental equation $(5 - x)e^x = 5$ is $x = 4.965$. 

(b) What is the strongest wavelength emitted by the Sun, which has a 
surface temperature of 5750 K (see P13.1)? 

(c) Also find $\nu_{\max}$ and show that it is not the same as $c/\lambda_{\max}$. Why would 
we be interested mainly in $\lambda_{\max}$? 

Physical Constants 

Constant                     Symbol          Value 
Permittivity                 $\epsilon_0$    $8.8542 \times 10^{-12}$ C$^2$/N·m$^2$ 
Permeability                 $\mu_0$         $4\pi \times 10^{-7}$ T·m/A (or kg·m/C$^2$) 
Speed of light in vacuum     $c$             $2.9979 \times 10^{8}$ m/s 
Charge of an electron        $q_e$           $1.602 \times 10^{-19}$ C 
Mass of an electron          $m_e$           $9.109 \times 10^{-31}$ kg 
Boltzmann's constant         $k_B$           $1.380 \times 10^{-23}$ J/K 
Planck's constant            $h$             $6.626 \times 10^{-34}$ J·s 
                             $\hbar$         $1.054 \times 10^{-34}$ J·s 
Stefan-Boltzmann constant    $\sigma$        $5.670 \times 10^{-8}$ W/m$^2$·K$^4$ 